For Whom Is This Book Written?


Crow’s Law: Do not think what you want to think until you know what you
ought to know.¹


Linear algebra is a living, active branch of mathematical research which is central 
to almost all other areas of mathematics and which has important applications in all 
branches of the physical and social sciences and in engineering. However, in recent 
years the content of linear algebra courses required to complete an undergraduate 
degree in mathematics—and even more so in other areas—at all but the most ded- 
icated universities, has been depleted to the extent that it falls far short of what is 
in fact needed for graduate study and research or for real-world application. This 
is true not only in the areas of theoretical work but also in the areas of computa- 
tional matrix theory, which are becoming more and more important to the working 
researcher as personal computers become a common and powerful tool. Students 
are not only less able to formulate or even follow mathematical proofs, they are also 
less able to understand the underlying mathematics of the numerical algorithms they 
must use. The resulting knowledge gap has led to frustration and recrimination on 
the part of both students and faculty alike, with each silently—and sometimes not 
so silently—blaming the other for the resulting state of affairs. This book is written 
with the intention of bridging that gap. It was designed to be used in one or more of
several possible ways: 
(1) As a self-study guide; 
(2) As a textbook for a course in advanced linear algebra, either at the upper-class 
undergraduate level or at the first-year graduate level; or 
(3) As a reference book.
It is also designed to be used to prepare for the linear algebra portion of prelim 
exams or Ph.D. qualifying exams. 

This volume is self-contained to the extent that it does not assume any previ- 
ous knowledge of formal linear algebra, though the reader is assumed to have been 
exposed, at least informally, to some basic ideas or techniques, such as matrix
manipulation and the solution of a small system of linear equations. It does, however,
assume a seriousness of purpose, considerable motivation, and a modicum of
mathematical sophistication on the part of the reader.

¹This law, attributed to John Crow of King’s College, London, is quoted by R.V. Jones in his book
Most Secret War, Wordsworth, 1998 (ISBN 978-1853266997).

The theoretical constructions presented here are illustrated with a large number of 
examples taken from various areas of pure and applied mathematics. As in any area 
of mathematics, theory and concrete examples must go hand in hand and need to be 
studied together. As the German philosopher Immanuel Kant famously remarked, 
concepts without percepts are empty, whereas percepts without concepts are blind.

The book also contains a large number of exercises, many of which are quite 
challenging, which I have come across or thought up in over thirty years of teaching. 
Many of these exercises have appeared in print before, in such journals as Ameri- 
can Mathematical Monthly, College Mathematics Journal, Mathematical Gazette, 
or Mathematics Magazine, in various mathematics competitions or circulated prob- 
lem collections, or even on the internet. Some were donated to me by colleagues 
and even students, and some originated in files of old exams at various universities 
which I have visited in the course of my career. Since, over the years, I did not keep 
track of their sources, all I can do is offer a collective acknowledgement to all those 
to whom it is due. Good problem formulators, like the God of the abbot of Citeaux, 
know their own. Deliberately, difficult exercises are not marked with an asterisk or 
other symbol. Solving exercises is an integral part of learning mathematics and the 
reader is definitely expected to do so, especially when the book is used for self- 
study. Try them all and remember the “grook” penned by the Danish genius Piet 
Hein: Problems worthy of attack / Prove their worth by hitting back. 

Solving a problem using theoretical mathematics is often very different from 
solving it computationally, and so strong emphasis is placed on the interplay of the- 
oretical and computational results. Real-life implementation of theoretical results 
is perpetually plagued by errors: errors in modeling, errors in data acquisition and 
recording, and errors in the computational process itself due to roundoff and trun- 
cation. There are further constraints imposed by limitations in time and memory 
available for computation. Thus the most elegant theoretical solution to a problem 
may not lead to the most efficient or useful method of solution in practice. While no 
reference is made to particular computer software, the concurrent use of a personal
computer equipped with symbolic-manipulation software such as MAPLE, MATHEMATICA,
MATLAB, or MUPAD is definitely advised.

In order to show the “human face” of mathematics, the book also includes a 
large number of thumbnail photographs of researchers who have contributed to the 
development of the material presented in this volume. 
Notation and Terminology


Sets will be denoted by braces, { }, between which we will either list the elements
of the set or give a rule for determining whether something is an element of the set
or not,¹ as in $\{x \mid p(x)\}$, which is read “the set of all x such that p(x)”. If a is an
element of a set A, we write $a \in A$; if it is not an element of A, we write $a \notin A$. When
one enumerates the elements of a set, the order is not important. Thus {1, 2, 3, 4}
and {4, 1, 3, 2} both denote the same set. However, we often do wish to impose an
order on sets the elements of which we enumerate. Rather than introduce new and
cumbersome notation to handle this, we will make the convention that when we
enumerate the elements of a finite or countably-infinite set, we will assume an implied
order, reading from left to right. Thus, the implied order on the set {1, 2, 3, ...} is
indeed the usual one, whereas {4, 1, 3, 2} gives the first four positive integers, ordered
alphabetically. The empty set, namely the set having no elements, is denoted by $\emptyset$.
Sometimes we will use the word “collection” as a synonym for “set”, generally to 
avoid talking about “sets of sets”. 

A finite or countably-infinite selection of elements of a set A is a list. Members 
of a list are assumed to be in a definite order, given by their indices or by the implied
order of reading from left to right. Lists are usually written without brackets:
$a_1, \dots, a_n$, though, in certain contexts, it will be more convenient to write them as
ordered n-tuples $(a_1, \dots, a_n)$. Note that the elements of a list need not be distinct:
3, 1, 4, 1, 5, 9 is a list of six positive integers, the second and fourth elements of 
which are equal to 1. A countably-infinite list of elements of a set A is also often 
called a sequence of elements of A. The set of all distinct members of a list is called 
the underlying subset of the list. 

If A and B are sets, then their union $A \cup B$ is the set of all elements that belong to
either A or B, and their intersection $A \cap B$ is the set of all elements belonging both
to A and to B. More generally, if $\{A_i \mid i \in \Omega\}$ is a (possibly-infinite) collection of
sets, then $\bigcup_{i \in \Omega} A_i$ is the set of all elements that belong to at least one of the $A_i$ and
$\bigcap_{i \in \Omega} A_i$ is the set of all elements that belong to all of the $A_i$. If A and B are sets,
then the difference set $A \setminus B$ is the set of all elements of A which do not belong
to B.

¹Mathematically, these two ways of defining a set are equivalent, but philosophically and
functionally they are not. Listing the elements of a set involves denotation whereas giving a rule for
determining set membership involves connotation. This distinction becomes important when we
attempt to use computers to manipulate sets.

A function f from a nonempty set A to a nonempty set B is a rule which assigns 
to each element a of A a unique element f(a) of B. The set A is called the domain 
of the function and the set B is called the range of the function. To denote that f is
a function from A to B, we write $f : A \to B$. To denote that an element b of B is
assigned to an element a of A by f, we write $f : a \mapsto b$. (Note the different form
of the arrow!) This notation is particularly helpful in the case that the function f is
defined by a formula. Thus, for example, if f is a function from the set of integers
to the set of integers defined by $f : a \mapsto a^3$, then we know that f assigns to each
integer its cube. The set of all functions from a nonempty set A to a nonempty set
B is denoted by $B^A$. If $f \in B^A$ and if $A'$ is a nonempty subset of A, then a function
$f' \in B^{A'}$ is the restriction of f to $A'$, and f is the extension of $f'$ to A, if and only
if $f' : a' \mapsto f(a')$ for all $a' \in A'$.

Functions f and g in $B^A$ are equal if and only if f(a) = g(a) for all $a \in A$.
In this case, we write f = g. A function $f \in B^A$ is monic if and only if it assigns
different elements of B to different elements of A, i.e., if and only if $f(a_1) \neq f(a_2)$
whenever $a_1 \neq a_2$ in A. A function $f \in B^A$ is epic if and only if every element
of B is assigned by f to some element of A. A function which is both monic and
epic is bijective. A bijective function from a set A to a set B determines a bijective
correspondence between the elements of A and the elements of B. If $f : A \to B$ is a
bijective function, then we can define the inverse function $f^{-1} : B \to A$ by
the condition that $f^{-1}(b) = a$ if and only if f(a) = b. This inverse function is also
bijective. A bijective function from a set A to itself is a permutation of A. Note that
there is always at least one permutation of any nonempty set A, namely the identity
function $a \mapsto a$.
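
For finite sets, the notions of monic, epic, and bijective functions can be checked mechanically. The following Python sketch is an illustration only: it assumes that a function is represented as a dictionary from arguments to values, and the names is_monic, is_epic, and inverse are hypothetical helpers chosen for this example.

```python
def is_monic(f):
    """True if distinct arguments always receive distinct values."""
    return len(set(f.values())) == len(f)


def is_epic(f, codomain):
    """True if every element of the codomain is the value of some argument."""
    return set(f.values()) == set(codomain)


def inverse(f, codomain):
    """Return the inverse of a bijective function as a new dictionary."""
    if not (is_monic(f) and is_epic(f, codomain)):
        raise ValueError("only a bijective function has an inverse")
    return {b: a for a, b in f.items()}


B = {"x", "y", "z"}
f = {0: "y", 1: "z", 2: "x"}   # bijective from {0, 1, 2} to B
g = {0: "x", 1: "x", 2: "y"}   # neither monic nor epic

f_inv = inverse(f, B)          # satisfies f_inv[b] == a exactly when f[a] == b
```

Here f_inv undoes f, while calling inverse(g, B) would raise an error, since g is not bijective.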

The Cartesian product $A_1 \times A_2$ of nonempty sets $A_1$ and $A_2$ is the set of all
ordered pairs $(a_1, a_2)$, where $a_1 \in A_1$ and $a_2 \in A_2$. More generally, if $A_1, \dots, A_n$
is a list of nonempty sets, then $A_1 \times \cdots \times A_n$ is the set of all ordered n-tuples
$(a_1, \dots, a_n)$ satisfying the condition that $a_i \in A_i$ for each $1 \leq i \leq n$. Note that each
ordered n-tuple $(a_1, \dots, a_n)$ uniquely defines a function $f : \{1, \dots, n\} \to \bigcup_{i=1}^n A_i$
given by $f : i \mapsto a_i$ for each $1 \leq i \leq n$. Conversely, each function $f : \{1, \dots, n\} \to
\bigcup_{i=1}^n A_i$ satisfying the condition that $f(i) \in A_i$ for $1 \leq i \leq n$ defines such an
ordered n-tuple, namely $(f(1), \dots, f(n))$. This suggests a method for defining the
Cartesian product of an arbitrary collection of nonempty sets. If $\{A_i \mid i \in \Omega\}$ is an
arbitrary collection of nonempty sets, then the set $\prod_{i \in \Omega} A_i$ is defined to be the set
of all those functions f from $\Omega$ to $\bigcup_{i \in \Omega} A_i$ satisfying the condition that $f(i) \in A_i$
for each $i \in \Omega$. The existence of such functions is guaranteed by a fundamental
axiom of set theory, known as the Axiom of Choice. A certain amount of controversy
surrounds this axiom, since it leads to some very counter-intuitive results.
Thus, for example, in 1924 Polish mathematicians Stefan Banach and Alfred Tarski
showed that if the Axiom of Choice is assumed then any solid sphere can be split
into finitely-many pieces which can be reassembled to form two solid spheres of the
same size as the original sphere. Therefore, there are mathematicians who prefer to
make as little use of the Axiom of Choice as possible. In 1963, American mathematician
P. J. Cohen showed that the Axiom of Choice is independent of the other
axioms of Zermelo–Fraenkel set theory, and so one is, in principle, free to assume
either it or its negation. Since we will need this axiom constantly throughout this
book, we will always assume that it holds.

In the foregoing construction, we did not assume that the sets $A_i$ were necessarily
distinct. Indeed, it may very well happen that there exists a set A such that $A_i = A$
for all $i \in \Omega$. In that case, we see that $\prod_{i \in \Omega} A_i$ is just $A^\Omega$. If the set $\Omega$ is finite,
say $\Omega = \{1, \dots, n\}$, then we write $A^n$ instead of $A^\Omega$. Thus, $A^n$ is just the set of all
ordered n-tuples $(a_1, \dots, a_n)$ of elements of A.


Example The function $f_2 : N^2 \to N$ given by

$$f_2 : (i, j) \mapsto \frac{1}{2}\left(i^2 + j^2 + i + 2ij + 3j\right)$$

is bijective. For k > 2 we can define a bijective function $f_k : N^k \to N$ inductively
by

$$f_k : (i_1, \dots, i_k) \mapsto f_2\bigl(i_1, f_{k-1}(i_2, \dots, i_k)\bigr).$$
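
The bijectivity of such a pairing function can be tested numerically on a finite range. The Python sketch below is an illustration, not a proof: it assumes the formula $\frac{1}{2}(i^2 + j^2 + i + 2ij + 3j)$ for the pairing of two nonnegative integers and the inductive rule that pairs the first argument with the code of the remaining ones. The names f2 and fk are chosen to match the example.

```python
def f2(i, j):
    """Pair two nonnegative integers into one nonnegative integer."""
    # The numerator equals (i + j)(i + j + 1) + 2j, which is always even.
    return (i * i + j * j + i + 2 * i * j + 3 * j) // 2


def fk(*args):
    """Code a list of k >= 2 nonnegative integers as a single one."""
    if len(args) < 2:
        raise ValueError("at least two arguments are required")
    if len(args) == 2:
        return f2(*args)
    # f_k(i_1, ..., i_k) = f_2(i_1, f_{k-1}(i_2, ..., i_k))
    return f2(args[0], fk(*args[1:]))


# All pairs (i, j) with i + j < n should be sent onto 0, 1, ..., n(n+1)/2 - 1,
# each value being attained exactly once.
n = 30
values = sorted(f2(i, j) for i in range(n) for j in range(n - i))
hits_each_value_once = values == list(range(n * (n + 1) // 2))
```

The check covers only finitely many pairs; the general statement still requires an argument by induction on i + j.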


We use the following standard notation for some common sets of numbers: 


N   the set of all nonnegative integers,
Z   the set of all integers,
Q   the set of all rational numbers,
R   the set of all real numbers,
C   the set of all complex numbers.


Other notation is introduced throughout the text, as is appropriate. See the Summary
of Notation in Appendix A of the book.


Fields 


The way of mathematical thought is twofold: the mathematician first proceeds in- 
ductively from the particular to the general and then deductively from the general 
to the particular. Moreover, throughout its development, mathematics has shown 
two aspects—the conceptual and the computational—the symphonic interleaving of 
which forms one of the major aspects of the subject’s aesthetic. 

Let us therefore begin with the first mathematical structure—numbers. By the 
Hellenistic times, mathematicians distinguished between two types of numbers: the 
rational numbers, namely those which could be written in the form $\frac{m}{n}$ for some
integer m and some positive integer n, and those numbers representing the geometric
magnitude of segments of the line, which today we call real numbers and which, in
decimal notation, are written in the form $m.k_1k_2k_3\ldots$ where m is an integer and the
$k_i$ are digits. The fact that the set Q of rational numbers is not equal to the set R of
real numbers was already noticed by the followers of the early Greek mathemati- 
cian/mystic Pythagoras. On both sets of numbers we define operations of addition 
and multiplication which satisfy certain rules of manipulation. Isolating these rules 
as part of a formal system was a task first taken on in earnest by nineteenth-century 
British and German mathematicians. From their studies evolved the notion of a field, 
which will be basic to our considerations. However, since fields are not our primary 
object of study, we will delve only minimally into this fascinating notion. A seri- 
ous consideration of field theory must be deferred to an advanced course in abstract 
algebra. 

A nonempty set F together with two functions $F \times F \to F$, respectively called
addition (as usual, denoted by +) and multiplication (as usual, denoted by $\cdot$ or by
concatenation), is a field if the following conditions are satisfied:

(1) (associativity of addition and multiplication): $a + (b + c) = (a + b) + c$ and
$a(bc) = (ab)c$ for all $a, b, c \in F$.

(2) (commutativity of addition and multiplication): $a + b = b + a$ and $ab = ba$ for
all $a, b \in F$.

(3) (distributivity of multiplication over addition): $a(b + c) = ab + ac$ for all
$a, b, c \in F$.

(4) (existence of identity elements for addition and multiplication): There exist distinct
elements of F, which we will denote by 0 and 1 respectively, satisfying
$a + 0 = a$ and $a1 = a$ for all $a \in F$.

(5) (existence of additive inverses): For each $a \in F$ there exists an element of F,
which we will denote by $-a$, satisfying $a + (-a) = 0$.

(6) (existence of multiplicative inverses): For each $0 \neq a \in F$ there exists an element
of F, which we will denote by $a^{-1}$, satisfying $aa^{-1} = 1$.




With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach (Weber, 
Dedekind, Kronecker and Steinitz). 


The development of the abstract theory of fields is generally credited to the nineteenth- 
century German mathematician Heinrich Weber, based on earlier work by the German 
mathematicians Richard Dedekind and Leopold Kronecker. Another nineteenth-century 
mathematician, the British Augustus De Morgan, was among the first, along with French
mathematician François Joseph Servois, to isolate the importance of such properties as
associativity, distributivity, and so forth. The final axioms of a field are due to the twentieth- 
century German mathematician Ernst Steinitz. 


Note that we did not assume that the elements $-a$ and $a^{-1}$ are unique, though
we will soon prove that in fact they are. If a and b are elements of a field F, we will
follow the usual conventions by writing $a - b$ instead of $a + (-b)$ and $\frac{a}{b}$ instead
of $ab^{-1}$. Moreover, if $0 \neq a \in F$ and if n is a positive integer, then na denotes the
sum $a + \cdots + a$ (n summands) and $a^n$ denotes the product $a \cdots a$ (n factors). If n
is a negative integer, then na denotes $(-n)(-a)$ and $a^n$ denotes $(a^{-1})^{-n}$. Finally,
if n = 0 then na denotes the field element 0 and $a^n$ denotes the field element 1. For
$0 = a \in F$, we define na = 0 for all integers n and $a^n = 0$ for all positive integers n.
The symbol $0^k$ is not defined for $k \leq 0$.

As an immediate consequence of the associativity and commutativity of addition,
we see that the sum of any list $a_1, \dots, a_n$ of elements of a field F is the same, no matter
in which order we add them. We can therefore unambiguously write $a_1 + \cdots + a_n$.
This sum is also often denoted by $\sum_{i=1}^n a_i$. Similarly, the product of these elements
is the same, no matter in which order we multiply them. We can therefore unambiguously
write $a_1 \cdots a_n$. This product is also often denoted by $\prod_{i=1}^n a_i$. Also, a
simple inductive argument shows that multiplication distributes over arbitrary sums:
if $a \in F$ and $b_1, \dots, b_n$ is a list of elements of F then $a\left(\sum_{i=1}^n b_i\right) = \sum_{i=1}^n ab_i$.

We easily see that Q and R, with the usual addition and multiplication, are fields. 

A subset G of a field F is a subfield if and only if it contains 0 and 1, is closed 
under addition and multiplication, and contains the additive and multiplicative
inverses of all of its nonzero elements. Thus, for example, Q is a subfield of R. It is
easy to verify¹ that the intersection of a collection of subfields of a field F is again
a subfield of F.
We now want to look at several additional important examples of fields. 


Example Let $C = R^2$ and define operations of addition and multiplication on C by
setting $(a, b) + (c, d) = (a + c, b + d)$ and $(a, b) \cdot (c, d) = (ac - bd, ad + bc)$.
These operations define the structure of a field on C, in which the identity element
for addition is (0, 0), the identity element for multiplication is (1, 0), the additive
inverse of (a, b) is $(-a, -b)$, and

$$(a, b)^{-1} = \left(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\right)$$

for all $(0, 0) \neq (a, b)$. This field is called the field of complex numbers. The set
of all elements of C of the form (a, 0) forms a subfield of C, which we normally
identify with R and therefore it is standard to consider R as a subfield of C. In
particular, we write a instead of (a, 0) for any real number a. The element (0, 1) of
C is denoted by i. This element satisfies the condition that $i^2 = (-1, 0)$ and so it is
often written as $\sqrt{-1}$. We also note that any element (a, b) of C can be written as
$(a, 0) + b(0, 1) = a + bi$, and, indeed, that is the way complex numbers are usually
written and how we will denote them from now on. If z = a + bi, then a is the real
part of z, which is often denoted by Re(z), while b is the imaginary part of z, which
is often denoted by Im(z). The field of complex numbers is extremely important in
mathematics. From a geometric point of view, if we identify R with the set of points 
on the Euclidean line, as one does in analytic geometry, then it is natural to identify 
C with the set of points in the Euclidean plane. 
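
The construction of the complex numbers as ordered pairs of real numbers can be tried out directly. The following Python sketch is an illustration under stated assumptions: pairs are stored as tuples, exact rational arithmetic from the standard fractions module stands in for real arithmetic, and the function names c_add, c_mul, and c_inv are hypothetical.

```python
from fractions import Fraction


def c_add(x, y):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = x, y
    return (a + c, b + d)


def c_mul(x, y):
    """(a, b) . (c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)


def c_inv(x):
    """Multiplicative inverse (a / (a^2 + b^2), -b / (a^2 + b^2))."""
    a, b = x
    n = a * a + b * b
    if n == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (Fraction(a) / n, Fraction(-b) / n)


i = (0, 1)
i_squared = c_mul(i, i)            # the pair (-1, 0)
z = (Fraction(3), Fraction(4))
z_times_inverse = c_mul(z, c_inv(z))   # the identity (1, 0)
```

The product rule agrees with Python's built-in complex type, for instance on the pairs (1, 2) and (3, 4).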




With kind permission of the Harvard Arts Museum (Descartes); With kind permission of ETH-Bibliothek
Zurich, Image Archive (Euler); With kind permission of Bibliothèque nationale de France (Argand).


The term “imaginary” was coined by the seventeenth-century French philosopher and
mathematician René Descartes. The use of i to denote $\sqrt{-1}$ was introduced by the eighteenth-
century Swiss mathematician Leonhard Euler. The geometric representation of the complex
numbers was first proposed at the end of the eighteenth century by the Norwegian
surveyor Caspar Wessel, and later by the French accountant Jean-Robert Argand. It was 
studied in detail by the nineteenth-century Italian mathematician Giusto Bellavitis. 


¹When a mathematician says that something is “easy to see” or “trivial”, it means that you are
expected to take out a pencil and paper and spend some time (often considerable) checking it out
by yourself.
If $z = a + bi \in C$ then we denote the complex number $a - bi$, called the complex
conjugate of z, by $\bar{z}$. It is easy to see that for all $z, z' \in C$ we have
$\overline{z + z'} = \bar{z} + \overline{z'}$, $\overline{-z} = -\bar{z}$, $\overline{zz'} = \bar{z} \cdot \overline{z'}$,
$\overline{z^{-1}} = (\bar{z})^{-1}$, and $\bar{\bar{z}} = z$. The number $z\bar{z}$ equals $a^2 + b^2$, which
is a nonnegative real number and so has a square root in R, which we will denote
by |z|. Note that |z| is nonzero whenever $z \neq 0$. From a geometric point of view,
this number is just the distance from the number z, considered as a point in the
Euclidean plane, to the origin, just as the usual absolute value |a| of a real number
a is the distance between a and 0 on the real line. It is easy to see that if y and z are
complex numbers then $|yz| = |y| \cdot |z|$ and $|y + z| \leq |y| + |z|$. Moreover, if z = a + bi
then

$$z + \bar{z} = 2a \leq 2|a| = 2\sqrt{a^2} \leq 2\sqrt{a^2 + b^2} = 2|z|.$$

We also note, as a direct consequence of the definition, that $|\bar{z}| = |z|$ for every complex
number z and so $z^{-1} = |z|^{-2}\bar{z}$ for all $0 \neq z \in C$. In particular, if |z| = 1 then
$z^{-1} = \bar{z}$.
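
These properties of the conjugate and of the absolute value are easy to spot-check with Python's built-in complex type. The sketch below is only a numerical illustration for two sample values chosen here; floating-point results are compared with a tolerance where exact equality cannot be expected.

```python
import math

z = 3 + 4j
y = 1 - 2j

z_bar = z.conjugate()              # the complex conjugate 3 - 4j
norm_squared = (z * z_bar).real    # a^2 + b^2 = 25.0
modulus = abs(z)                   # the square root of a^2 + b^2, namely 5.0

# The inverse of z computed as |z|^(-2) times the conjugate of z.
z_inverse = z_bar / norm_squared

multiplicative = math.isclose(abs(y * z), abs(y) * abs(z))
triangle = abs(y + z) <= abs(y) + abs(z)
```

Both multiplicative and triangle evaluate to True for these samples, and z * z_inverse agrees with 1 up to rounding error.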


Example The set $Q^2$ is a subfield of the field C defined above. However, it is also
possible to define field structures on $Q^2$ in other ways. Indeed, let $F = Q^2$ and
let p be a fixed prime integer. Define addition and multiplication on F by setting
$(a, b) + (c, d) = (a + c, b + d)$ and $(a, b) \cdot (c, d) = (ac + bdp, ad + bc)$.

Again, one can check that F is indeed a field and that, again, the set of all elements
of F of the form (a, 0) is a subfield, which we will identify with Q. Moreover,
the additive inverse of $(a, b) \in F$ is $(-a, -b)$ and the multiplicative inverse of
$(0, 0) \neq (a, b) \in F$ is

$$\left(\frac{a}{a^2 - pb^2}, \frac{-b}{a^2 - pb^2}\right).$$

(We note that $a^2 - pb^2$ is the product of the nonzero real numbers $a + b\sqrt{p}$ and
$a - b\sqrt{p}$ and so is nonzero.) The element (0, 1) of F satisfies $(0, 1)^2 = (p, 0)$ and so one
usually denotes it by $\sqrt{p}$ and, as before, any element of F can be written in the form
$a + b\sqrt{p}$, where $a, b \in Q$. The field F is usually denoted by $Q(\sqrt{p})$. Since there are
infinitely-many distinct prime integers, we see that there are infinitely-many ways
of defining different field structures on $Q \times Q$, all having the same addition.
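
The arithmetic of $Q(\sqrt{p})$ on pairs of rational numbers can be sketched in a few lines of Python. This is an illustration under assumptions made here: pairs are tuples of Fraction values from the standard fractions module, and make_field is a hypothetical helper returning the multiplication and inversion rules for a given prime p.

```python
from fractions import Fraction


def make_field(p):
    """Return (mul, inv) for pairs (a, b) standing for a + b*sqrt(p)."""

    def mul(x, y):
        # (a, b) . (c, d) = (ac + bdp, ad + bc)
        (a, b), (c, d) = x, y
        return (a * c + b * d * p, a * d + b * c)

    def inv(x):
        # (a, b)^(-1) = (a / (a^2 - p b^2), -b / (a^2 - p b^2))
        a, b = x
        n = a * a - p * b * b
        if n == 0:
            raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
        return (Fraction(a) / n, Fraction(-b) / n)

    return mul, inv


mul, inv = make_field(2)
root = (Fraction(0), Fraction(1))          # stands for sqrt(2)
root_squared = mul(root, root)             # the pair (2, 0)
x = (Fraction(1), Fraction(1))             # stands for 1 + sqrt(2)
x_inverse = inv(x)                         # the pair (-1, 1), i.e. -1 + sqrt(2)
```

Changing the argument of make_field changes the multiplication but not the componentwise addition, which is the point of the example.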


Example Fields do not have to be infinite. Let p be a positive integer and let
$Z/(p) = \{0, 1, \dots, p - 1\}$. For each nonnegative integer n, let us, for the purposes
of this example, denote the remainder after dividing n by p as $[n]_p$. Thus
we note that $[n]_p \in Z/(p)$ for each nonnegative integer n and that $[i]_p = i$ for all
$i \in Z/(p)$. We now define operations on Z/(p) by setting $[n]_p + [k]_p = [n + k]_p$
and $[n]_p \cdot [k]_p = [nk]_p$. It is easy to check that if the integer p is prime then Z/(p),
together with these two operations, is again a field, known as the Galois field of
order p. This field is usually denoted by GF(p). While Galois fields were first considered
mathematical curiosities, they have since found important applications in
coding theory, cryptography, and modeling of computer processes.
These are not the only possible finite fields. Indeed, it is possible to show that for
each prime integer p and each positive integer n there exists an (essentially unique)
field with $p^n$ elements, usually denoted by $GF(p^n)$.
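
The role played by the primality of p in the construction of Z/(p) can be seen by brute force. The Python sketch below is an illustration with hypothetical function names; it assumes p > 1 and checks only the existence of multiplicative inverses and the absence of zero divisors, not the remaining field axioms.

```python
def add_mod(n, k, p):
    """[n]_p + [k]_p = [n + k]_p."""
    return (n + k) % p


def mul_mod(n, k, p):
    """[n]_p . [k]_p = [nk]_p."""
    return (n * k) % p


def every_nonzero_element_has_inverse(p):
    """Search for a multiplicative inverse of each of 1, ..., p - 1."""
    return all(
        any(mul_mod(a, b, p) == 1 for b in range(1, p))
        for a in range(1, p)
    )


def zero_divisors(p):
    """Pairs of nonzero elements (a, b), a <= b, whose product is 0."""
    return [
        (a, b)
        for a in range(1, p)
        for b in range(a, p)
        if mul_mod(a, b, p) == 0
    ]


prime_case = every_nonzero_element_has_inverse(7)       # True
composite_case = every_nonzero_element_has_inverse(6)   # False
```

For p = 6 the pair (2, 3) is a zero divisor, so neither 2 nor 3 can have a multiplicative inverse.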


With kind permission of Bibliothèque nationale de France (Galois); With kind permission of
the American Mathematical Society (Moore).

The nineteenth-century French mathematical genius Évariste Galois, who died at the age of 20,
was the first to consider such structures. The study of finite and infinite fields was unified
in the 1890s by Eliakim Hastings Moore, the first American-born mathematician to achieve an
international reputation.


Example Some important structures are “very nearly” fields. For example, let
$R_\infty = R \cup \{\infty\}$, and define operations $\boxplus$ and $\boxdot$ on $R_\infty$ by setting

$$a \boxplus b = \begin{cases} \min\{a, b\} & \text{if } a, b \in R, \\ b & \text{if } a = \infty, \\ a & \text{if } b = \infty, \end{cases}$$

and

$$a \boxdot b = \begin{cases} a + b & \text{if } a, b \in R, \\ \infty & \text{otherwise.} \end{cases}$$

This structure, called the optimization algebra, satisfies all of the conditions of a
field except for the existence of additive inverses (such structures are known as semifields).
As the name suggests, it has important applications in optimization theory
and the analysis of discrete-event dynamical systems. There are several other semifields
which have significant applications and which have been extensively studied.
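
A structure of this kind, with minimum playing the role of addition and ordinary addition playing the role of multiplication, is simple to experiment with. The Python sketch below is an illustration only; it assumes that math.inf represents the adjoined element, and the names oplus and odot are hypothetical.

```python
import math

INF = math.inf   # stands for the adjoined element "infinity"


def oplus(a, b):
    """The "addition": the minimum, with INF acting as neutral element."""
    return min(a, b)


def odot(a, b):
    """The "multiplication": ordinary addition, absorbing at INF."""
    return a + b


samples = [INF, -2, 0, 3, 7]

# "Multiplication" distributes over "addition" on all sample triples.
distributive = all(
    odot(a, oplus(b, c)) == oplus(odot(a, b), odot(a, c))
    for a in samples for b in samples for c in samples
)

# INF is the identity for oplus, and 0 is the identity for odot.
identities = all(oplus(INF, a) == a and odot(0, a) == a for a in samples)
```

No sample b satisfies oplus(3, b) == INF, which reflects the failure of the additive-inverse axiom.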


Another possibility of generalizing the notion of a field is to consider an algebraic 
structure which satisfies all of the conditions of a field except for the existence of 
multiplicative inverses, and to replace that condition by the condition that if $a, b \neq 0$
then $ab \neq 0$. Such structures are known as integral domains. The set Z of all integers
is the simplest example of an integral domain which is not a field. Algebras of 
polynomials over a field, which we will consider later, are also integral domains. In 
a course in abstract algebra, one proves that any integral domain can be embedded 
in a field. 

In the field GF(p) which we defined above, one can easily see that the sum
$1 + \cdots + 1$ (p summands) equals 0. On the other hand, in the field Q, the sum of
any number of copies of 1 is always nonzero. This is an important distinction which
we will need to take into account in dealing with structures over fields. We therefore
define the characteristic of a field F to be equal to the smallest positive integer p
such that $1 + \cdots + 1$ (p summands) equals 0, if such an integer p exists, and to be
equal to 0 otherwise. We will not delve deeply into this concept, which is dealt with
in courses on field theory, except to note that the characteristic of a field, if nonzero,
always turns out to be a prime number, as we shall prove below.

In the definition of a field, we posited the existence of distinct identity elements 
for addition and multiplication, but did not claim that these elements were unique. 
It is, however, very easy to prove that fact. 


Proposition 2.1 Let F be a field.
(1) If e is an element of F satisfying $e + a = a$ for all $a \in F$ then e = 0;
(2) If u is an element of F satisfying $ua = a$ for all $a \in F$ then u = 1.


Proof By definition, $e = e + 0 = 0$ and $u = u1 = 1$.


Similarly, we prove that additive and multiplicative inverses, when they exist, are 
unique. Indeed, we can prove a stronger result. 


Proposition 2.2 If a and b are elements of a field F then:
(1) There exists a unique element c of F satisfying $a + c = b$.
(2) If $a \neq 0$ then there exists a unique element d of F satisfying $ad = b$.


Proof (1) Choose $c = b - a$. Then

$$a + c = a + (b - a) = a + [b + (-a)] = a + [(-a) + b] = [a + (-a)] + b = 0 + b = b.$$

Moreover, if $a + x = b$ then

$$x = 0 + x = [(-a) + a] + x = (-a) + (a + x) = (-a) + b = b - a,$$

proving uniqueness.

(2) Choose $d = a^{-1}b$. Then $ad = a(a^{-1}b) = (aa^{-1})b = 1b = b$. Moreover, if
$ay = b$ then $y = 1y = (a^{-1}a)y = a^{-1}(ay) = a^{-1}b$, proving uniqueness.


We now summarize some of the elementary properties of fields, which are all we 
will need for our discussion. 
Proposition 2.3 If a, b, and c are elements of a field F then:
(1) $0a = 0$;
(2) $(-1)a = -a$;
(3) $a(-b) = -(ab) = (-a)b$;
(4) $-(-a) = a$;
(5) $(-a)(-b) = ab$;
(6) $-(a + b) = (-a) + (-b)$;
(7) $a(b - c) = ab - ac$;
(8) If $a \neq 0$ then $(a^{-1})^{-1} = a$;
(9) If $a, b \neq 0$ then $(ab)^{-1} = b^{-1}a^{-1}$;
(10) If $a + c = b + c$ then a = b;
(11) If $c \neq 0$ and $ac = bc$ then a = b;
(12) If $ab = 0$ then a = 0 or b = 0.


Proof (1) Since $0a + 0a = (0 + 0)a = 0a$, we can add $-(0a)$ to both sides of the
equation to obtain $0a = 0$.

(2) Since $(-1)a + a = (-1)a + 1a = [(-1) + 1]a = 0a = 0$ and also $(-a) + a = 0$,
we see from Proposition 2.2 that $(-1)a = -a$.

(3) By (2) we have $a(-b) = a[(-1)b] = (-1)ab = -(ab)$ and similarly
$(-a)b = -(ab)$.

(4) Since $a + (-a) = 0 = -(-a) + (-a)$, this follows from Proposition 2.2.

(5) From (3) and (4) it follows that $(-a)(-b) = a[-(-b)] = ab$.

(6) Since $(a + b) + [(-a) + (-b)] = a + b + (-a) + (-b) = 0$ and
$(a + b) + [-(a + b)] = 0$, the result follows from Proposition 2.2.

(7) By (3) we have $a(b - c) = ab + a(-c) = ab + [-(ac)] = ab - ac$.

(8) Since $(a^{-1})^{-1}a^{-1} = 1 = aa^{-1}$, this follows from Proposition 2.2.

(9) Since $(a^{-1}b^{-1})(ba) = a^{-1}ab^{-1}b = 1 = (ab)^{-1}(ba)$, the result follows from
Proposition 2.2.

(10) This is an immediate consequence of adding $-c$ to both sides of the equation.

(11) This is an immediate consequence of multiplying both sides of the equation
by $c^{-1}$.

(12) If b = 0 we are done. If $b \neq 0$ then by (1) it follows that multiplying both
sides of the equation by $b^{-1}$ will yield a = 0.


The following two propositions are immediate consequences of Proposition 2.3. 


Proposition 2.4 Let a be a nonzero element of a finite field F having q elements.
Then $a^{-1} = a^{q-2}$.


Proof If q = 2 then F = GF(2) and a = 1, so the result is immediate. Hence we
can assume q > 2. Let $B = \{a_1, \dots, a_{q-1}\}$ be the nonzero elements of F, written
in some arbitrary order. Then $aa_i \neq aa_h$ for $i \neq h$ since, were they equal,
we would have $a_i = a^{-1}(aa_i) = a^{-1}(aa_h) = a_h$. Therefore $B = \{aa_1, \dots, aa_{q-1}\}$
and so $\prod_{i=1}^{q-1} a_i = \prod_{i=1}^{q-1} (aa_i) = a^{q-1}\left[\prod_{i=1}^{q-1} a_i\right]$. Moreover, this is a product of
nonzero elements of F and so, by Proposition 2.3(12), is also nonzero. Therefore,
by Proposition 2.3(11), $1 = a^{q-1}$, and so $aa^{-1} = 1 = a^{q-1} = a(a^{q-2})$, implying
that $a^{-1} = a^{q-2}$.
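
In the field GF(p), this result gives a practical way of computing inverses: raise the element to the power p − 2. The Python sketch below illustrates this for a prime p, using the built-in three-argument pow for modular exponentiation; the function name inverse_mod_p is hypothetical, and the caller is assumed to supply a prime.

```python
def inverse_mod_p(a, p):
    """Inverse of a nonzero element a of GF(p), computed as a^(p-2) mod p.

    The argument p is assumed to be prime; this is not checked.
    """
    if a % p == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return pow(a, p - 2, p)


p = 13
inverses = {a: inverse_mod_p(a, p) for a in range(1, p)}
all_correct = all((a * b) % p == 1 for a, b in inverses.items())
```

For example, inverse_mod_p(3, 7) is 5, since 3 · 5 = 15 leaves remainder 1 on division by 7.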


Proposition 2.5 If F is a field having characteristic p > 0, then p is prime.


Proof Assume that p is not prime. Then p = hk, where 0 < h, k < p. Therefore,
$a = h1_F$ and $b = k1_F$ are nonzero elements of F. But $ab = (hk)1_F = p1_F = 0$,
contradicting Proposition 2.3(12).


Of course, one can use Proposition 2.3 to prove many other identities among
elements of a field. A typical example is the following:


Proposition 2.6 (Hua’s identity) If a and b are nonzero elements of a field
F satisfying $a \neq b^{-1}$ then

$$a - aba = \left(a^{-1} + \left[b^{-1} - a\right]^{-1}\right)^{-1}.$$


Proof We note that

$$a^{-1} + (b^{-1} - a)^{-1} = a^{-1}\left[(b^{-1} - a) + a\right](b^{-1} - a)^{-1} = a^{-1}b^{-1}(b^{-1} - a)^{-1},$$

so $(a^{-1} + [b^{-1} - a]^{-1})^{-1} = (b^{-1} - a)ba = a - aba$.
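
Hua's identity can be spot-checked in the field Q using exact rational arithmetic. The Python sketch below is a numerical illustration, not a substitute for the proof; the function names hua_left and hua_right and the sample grid are choices made here.

```python
from fractions import Fraction


def hua_left(a, b):
    """The expression a - aba."""
    return a - a * b * a


def hua_right(a, b):
    """The expression (a^(-1) + [b^(-1) - a]^(-1))^(-1)."""
    return 1 / (1 / a + 1 / (1 / b - a))


# Nonzero rational samples; pairs with ab = 1 are excluded, as required.
samples = [Fraction(n, d) for n in (-3, -1, 1, 2, 5) for d in (1, 2, 3)]
identity_holds = all(
    hua_left(a, b) == hua_right(a, b)
    for a in samples for b in samples
    if a * b != 1
)
```

With a = 2/3 and b = 5, for instance, both sides equal −14/9.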


Loo-Keng Hua was a major twentieth-century Chinese mathematician.
Exercises


Exercise 1 

Let F be a field and let $G = F \times F$. Define operations of addition and multiplication
on G by setting $(a, b) + (c, d) = (a + c, b + d)$ and $(a, b) \cdot (c, d) = (ac, bd)$.
Do these operations define the structure of a field on G? 


Exercise 2 
Let K be the set of the following four-tuples of elements of GF(3): 


(0, 0, 0, 0), (1, 2, 1, 1), (2, 1, 2, 2), (1, 0, 0, 1), (2, 2, 1, 2),
(2, 0, 0, 2), (0, 1, 2, 0), (0, 2, 1, 0), (1, 1, 2, 1).


Define operations of addition and multiplication on K so that it becomes a field. 


Exercise 3 

Let $r \in R$ and let $0 \neq s \in R$. Define operations $\boxplus$ and $\boxdot$ on $R \times R$ by
$(a, b) \boxplus (c, d) = (a + c, b + d)$ and $(a, b) \boxdot (c, d) = (ac - bd(r^2 + s^2), ad + bc + 2rbd)$.
Do these operations, considered as addition and multiplication, respectively, define
the structure of a field on $R \times R$?


Exercise 4 

Define a new operation + on R by setting a + b = a*b. Show that R, on which 
we have the usual addition and this new operation as multiplication, satisfies all 
of the axioms of a field with the exception of one. 


Exercise 5 

Let $1 < t \in \mathbb{R}$ and let $F = \{a \in \mathbb{R} \mid a < 1\}$. Define operations $\oplus$ and $\odot$ on F as 
follows: 

(1) $a \oplus b = a + b - ab$ for all $a, b \in F$; 

(2) $a \odot b = 1 - t^{\log_t(1-a)\log_t(1-b)}$ for all $a, b \in F$. 

For which values of t does F, together with these operations, form a field? 


Exercise 6 
Show that the set of all real numbers of the form $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$, where 
$a, b, c, d \in \mathbb{Q}$, forms a subfield of $\mathbb{R}$. 


Exercise 7 
Is $\{a + b\sqrt{15} \mid a, b \in \mathbb{Q}\}$ a subfield of $\mathbb{R}$? 


Exercise 8 
Show that the field R has infinitely-many distinct subfields. 


Exercise 9 
Let F be a field and define a new operation ∗ on F by setting a ∗ b = a + b + ab. 
When is (F, +, ∗) a field? 


Exercise 10 

Let F be a field and let $G_n$ be the subset of F consisting of all elements which 
can be written as a sum of n squares of elements of F. 

(1) Is the product of two elements of G2 again an element of G2? 

(2) Is the product of two elements of G4 again an element of G4? 


Exercise 11 
Let $t = \sqrt[3]{2} \in \mathbb{R}$ and let S be the set of all real numbers of the form $a + bt + ct^2$, 
where $a, b, c \in \mathbb{Q}$. Is S a subfield of $\mathbb{R}$? 


Exercise 12 
Let F be a field. Show that the function $a \mapsto a^{-1}$ is a permutation of $F \setminus \{0_F\}$. 


Exercise 13 
Show that every z € C satisfies 


$$z^4 + 4 = (z - 1 - i)(z - 1 + i)(z + 1 + i)(z + 1 - i).$$ 


Exercise 14 

In each of the following, find the set of all complex numbers z = a+ bi satisfying 
the given relation. Note that this set may be empty or may be all of C. Justify your 
result in each case. 

(a) 2 =35(1 +iv3); 

(b) $(\sqrt{2})|z| = |a| + |b|$; 

(c) $|z| + z = 2 + i$; 

(d) $z^4 = 2 - (\sqrt{12})i$; 

(ce) t= -4. 


Exercise 15 
Let y be a complex number satisfying $|y| < 1$. Find the set of all complex numbers 
z satisfying $|z - y| < |1 - \bar{y}z|$. 


Exercise 16 
Let $z_1$, $z_2$, and $z_3$ be complex numbers satisfying the condition that $|z_i| = 1$ for 
i = 1, 2, 3. Show that $|z_1z_2 + z_1z_3 + z_2z_3| = |z_1 + z_2 + z_3|$. 


Exercise 17 
For any $z_1, z_2 \in \mathbb{C}$, show that $|z_1|^2 + |z_2|^2 - z_1\bar{z}_2 - \bar{z}_1z_2 = |z_1 - z_2|^2$. 


Exercise 18 
Show that $|z + 1| \leq |z + 1|^2 + |z|$ for all $z \in \mathbb{C}$. 


Exercise 19 
If $z \in \mathbb{C}$, find $w \in \mathbb{C}$ satisfying $w^2 = z$. 
Exercise 20 
Define new operations o and ¢ on C by setting y oz = |y|z and 


_ fo if y=0, 
eo piyz otherwise 


for all y, z € C. Is it true that wo (yoz)=(wey)o(woz) and wo(yoz) = 
(woy)o (woz) forall w, y,z eC? 


Exercise 21 
Let $0 \neq z \in \mathbb{C}$. Show that there are infinitely-many complex numbers y satisfying 
the condition $y\bar{y} = z\bar{z}$. 


Exercise 22 
(Abel’s inequality) Let $z_1, \ldots, z_n$ be a list of complex numbers and, for each 
$1 \leq k \leq n$, let $s_k = \sum_{i=1}^{k} z_i$. For real numbers $a_1, \ldots, a_n$ satisfying 
$a_1 \geq a_2 \geq \cdots \geq a_n \geq 0$, show that 

$$\left|\sum_{i=1}^{n} a_iz_i\right| \leq a_1\left(\max_{1 \leq k \leq n} |s_k|\right).$$ 




The nineteenth-century Norwegian mathematical genius Niels Henrik 
Abel died tragically at the age of 26. 


Exercise 23 
Let $0 \neq z_0 \in \mathbb{C}$ satisfy the condition $|z_0| < 2$. Show that there are precisely two 
complex numbers, $z_1$ and $z_2$, satisfying $|z_1| = |z_2| = 1$ and $z_1 + z_2 = z_0$. 


Exercise 24 
If p is a prime positive integer, find all subfields of GF(p). 


Exercise 25 
Find 107! in GF(33). 


Exercise 26 
Find elements $c, d \neq \pm 1$ in the field $\mathbb{Q}(\sqrt{5})$ satisfying cd = 19. 


Exercise 27 
Let F be the set of all real numbers of the form 

$$a + b(\sqrt[3]{5}) + c(\sqrt[3]{5})^2,$$ 

where $a, b, c \in \mathbb{Q}$. Is F a subfield of $\mathbb{R}$? 


Exercise 28 
Let p be a prime positive integer and let a € GF(p). Does there necessarily exist 
an element b of GF(p) satisfying b> = a? 


Exercise 29 

Let F = GF(11) and let G = F x F. Define operations of addition and multi- 
plication on G by setting (a, b) + (c,d) = (a+c,b+d) and (a,b)- (c,d) = 
(ac + 7bd, ad + bc). Do these operations define the structure of a field on G? 


Exercise 30 

Let F be a field and let G be a finite subset of $F \setminus \{0\}$ containing 1 and satisfying 
the condition that if $a, b \in G$ then $ab^{-1} \in G$. Show that there exists an element 
$c \in G$ such that $G = \{c^i \mid i \geq 0\}$. 


Exercise 31 
Let F be a field satisfying the condition that the function $a \mapsto a^2$ is a permutation 
of F. What is the characteristic of F? 


Exercise 32 
Is Z/(6) an integral domain? 


Exercise 33 
Let $F = \{a + b\sqrt{5} \in \mathbb{Q}(\sqrt{5}) \mid a, b \in \mathbb{Z}\}$. Is F an integral domain? 


Exercise 34 
Let F be an integral domain and let $a \in F$ satisfy $a^2 = a$. Show that a = 0 or 
a = 1. 


Exercise 35 
Let a be a nonzero element in an integral domain F. If $b \neq c$ are distinct elements 
of F, show that $ab \neq ac$. 


Exercise 36 

Let F be an integral domain and let G be a nonempty subset of F containing 0 
and 1 and closed under the operations of addition and multiplication in F. Is G 
necessarily an integral domain? 


Exercise 37 
Let U be the set of all positive integers and let F be the set of all functions 
from U to $\mathbb{C}$. Define operations of addition and multiplication on F by setting 
$f + g : k \mapsto f(k) + g(k)$ and $fg : k \mapsto \sum_{ij=k} f(i)g(j)$ for all $k \in U$. Is F, together 
with these operations, an integral domain? Is it a field? 


Exercise 38 

Let F be the set of all functions f from R to itself of the form f : th 
ye e-1 lak cos(kt) + by sin(kt)], where the ag, and bx are real numbers and n 
is some positive integer. Define addition and multiplication on F by setting 
ftege:tre fOH+ef and fg:trh f@g(t) for all te R. Is F, together 
with these operations, an integral domain? Is it a field? 


Exercise 39 
Show that every integral domain having only finitely-many elements is a field. 


Exercise 40 

Let F be a field of characteristic other than 2 in which there exist elements 
d|,...,n Satisfying )~"_, a? = —1. (This happens, for example, in the case 
F= .) Show that for any c € F there exist elements bj, ..., by of F satisfying 
c= Vint Oj. 


Exercise 41 
Let p be a prime integer. Show that for each $a \in GF(p)$ there exist elements b 
and c of GF(p), not necessarily distinct, satisfying $a = b^2 + c^2$. 


Exercise 42 
Let F be a field in which we have elements a, b, and c (not necessarily distinct) 
satisfying $a^2 + b^2 + c^2 = -1$. Show that there exist (not necessarily distinct) 
elements d and e of F satisfying $d^2 + e^2 = -1$. 


Exercise 43 
Is every nonzero element of the field GF(5) of the form $2^i$ for some positive 
integer i? What happens in the case of the field GF(7)? 


Exercise 44 
Find the set of all fields F in which there exists an element a satisfying the 
condition that a+ b=<a for all b € F ~ {a}. 


Exercise 45 

(Binomial formula) If a and b are elements of a field F, and if n is a positive 
integer, show that $(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$. 


Exercise 46 

Let F be a field of characteristic p > 0. Show that the function $\gamma : F \to F$ 
defined by $\gamma : a \mapsto a^p$ is monic. 


Exercise 47 

Let a and b be nonzero elements of a finite field F, and let m and n be positive 
integers satisfying $a^m = b^n = 1$. Show that there exists a nonzero element c of F 
satisfying $c^k = 1$, where k is the least common multiple of m and n. 


Exercise 48 
If a is a nonzero element of a field F, show that $(-a)^{-1} = -(a^{-1})$. 


Exercise 49 

Let F = GF(7) and let K = F × F. Define addition and multiplication on K by 
setting (a, b) + (c, d) = (a + c, b + d) and (a, b) · (c, d) = (ac − bd, ad + bc). 
Do these operations turn K into a field? What happens if F = GF(5)? 


Exercise 50 

A field F is orderable if and only if there exists a subset P closed under addition 
and multiplication such that for each $a \in F$ precisely one of the following conditions 
holds: (i) a = 0; (ii) $a \in P$; (iii) $-a \in P$. Show that GF(5) is not orderable. 


Exercise 51 

Let F be a field and let K be the set of all functions $f \in F^{\mathbb{Z}}$ satisfying the 
condition that there exists an integer (perhaps negative) $n_f$ such that f(i) = 0 
for all $i < n_f$. Define operations of addition and multiplication on K by setting 
$f + g : i \mapsto f(i) + g(i)$ and $fg : i \mapsto \sum_{j+k=i} f(j)g(k)$. Show that K is a field, 
called the field of formal Laurent series over F. 


Exercise 52 
Let F be a field. Find $A = \{(x, y) \in F^2 \mid x^2 + y^2 = 1\}$. 


Exercise 53 
Let F be a field having characteristic p > 0 and let $c \in F$. Show that there is at 
most one element b of F satisfying $b^p = c$. 


Exercise 54 

A ternary ring is a set R containing distinguished elements 0 and 1, together 
with a function $\theta : R^3 \to R$ satisfying the following conditions: 

(1) $\theta(1, a, 0) = \theta(a, 1, 0) = a$ for all $a \in R$; 

(2) $\theta(a, 0, c) = \theta(0, a, c) = c$ for all $a, c \in R$; 

(3) If $a, b, c \in R$ then there is a unique element y of R satisfying $\theta(a, b, y) = c$; 

(4) If $a, a', b, b' \in R$ with $a \neq a'$ then there is a unique element x of R satisfying 
$\theta(x, a, b) = \theta(x, a', b')$; 

(5) If $a, a', b, b' \in R$ with $a \neq a'$ then there are unique elements x and y of R 
satisfying $\theta(a, x, y) = b$ and $\theta(a', x, y) = b'$. 


These series were first studied by the nineteenth-century French engineer and mathematician, 
Pierre Alphonse Laurent. 
Such structures have applications in projective geometry. If F is a field, show that 
we can define a function $\theta : F^3 \to F$ in such a way that F becomes a ternary ring 
(with 0 and 1 being the neutral elements of the field). 


Exercise 55 

For h = 1, 2, 3, let $z_h = a_h + b_hi$ be a complex number satisfying $|z_h| = 1$. Assume, 
moreover, that $\sum_{h=1}^{3} z_h = 0$. Show that the points $(a_h, b_h)$ are the vertices 
of an equilateral triangle in the Euclidean plane. 


Vector Spaces Over a Field 


If n > 1 is an integer and if F is a field, it is natural to define addition on the set $F^n$ 
componentwise: 


$$(a_1, \ldots, a_n) + (b_1, \ldots, b_n) = (a_1 + b_1, \ldots, a_n + b_n).$$ 


More generally, if $\Omega$ is any nonempty set and if $F^\Omega$ is the set of all functions from 
$\Omega$ to the field F, we can define addition on $F^\Omega$ by setting $f + g : i \mapsto f(i) + g(i)$ 
for each $i \in \Omega$. Given these definitions, is it possible to define multiplication in such 
a manner that $F^n$ or $F^\Omega$ will become a field naturally containing F as a subfield? 
We have seen that if n = 2 and if $F = \mathbb{R}$ or $F = \mathbb{Q}$, this is possible and, indeed, in 
the latter case there are several different methods of doing it. If F = GF(p) then it 
is possible to define such a field structure on $F^n$ for every integer n > 1. However, 
in general the answer is negative, as we will show in a later chapter for the specific 
case of $\mathbb{R}^k$, where k > 2 is an odd integer. Nonetheless, it is possible to construct 
another important and useful structure on these sets, and this structure will be the 
focus of our attention for the rest of this book. We will first give the formal definition, 
and then look at a large number of examples. 

Let F be a field. A nonempty set V, together with a function $V \times V \to V$ called 
vector addition (denoted, as usual, by +) and a function $F \times V \to V$ called scalar 
multiplication (denoted, as a rule, by concatenation) is a vector space over F if the 
following conditions are satisfied: 

(1) (associativity of vector addition): $v + (w + y) = (v + w) + y$ for all $v, w, y \in V$. 

(2) (commutativity of vector addition): $v + w = w + v$ for all $v, w \in V$. 

(3) (existence of an identity element for vector addition): There exists an element $0_V$ 
of V satisfying the condition that $v + 0_V = v$ for all $v \in V$. 

(4) (existence of additive inverses): For each $v \in V$ there exists an element of V, 
which we will denote by $-v$, which satisfies $v + (-v) = 0_V$. 

(5) (distributivity of scalar multiplication over vector addition and of scalar 
multiplication over field addition): $a(v + w) = av + aw$ and $(a + b)v = av + bv$ for 
all $a, b \in F$ and $v, w \in V$. 


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 21 
DOI 10.1007/978-94-007-2636-9_3, © Springer Science+Business Media B.V. 2012 
(6) (associativity of scalar multiplication): $(ab)v = a(bv)$ for all $a, b \in F$ and 
$v \in V$. 

(7) (existence of identity element for scalar multiplication): $1v = v$ for all $v \in V$. 

The elements of V are called vectors and the elements of F are called scalars. 




The theory of vector spaces was developed in the 1880s by the American engineer and 
physicist, Josiah Willard Gibbs and the British engineer Oliver Heaviside, based on the 
work of the Scottish physicist James Clerk Maxwell, the German high-school teacher 
Hermann Grassmann, and the French engineer Jean Claude Saint-Venant. 


Example Note that condition (7), apparently trivial, does not follow from the other 
conditions. Indeed, if we take V = F but define scalar multiplication by $av = 0_V$ 
for all $a \in F$ and $v \in V$, we would get a structure which satisfies conditions (1) 
through (6) but not condition (7). 


If $v, w \in V$ we again write $v - w$ instead of $v + (-w)$. As we noted when we 
talked about fields, if $v_1, \ldots, v_n$ is a list of vectors in a vector space V over a field F, 
the associativity of vector addition allows us to unambiguously write $v_1 + \cdots + v_n$, 
and this sum is often denoted by $\sum_{i=1}^{n} v_i$. Moreover, if $a \in F$ is a scalar then we 
surely have $a(\sum_{i=1}^{n} v_i) = \sum_{i=1}^{n} av_i$. Similarly, if $a_1, \ldots, a_n$ is a list of scalars and 
if $v \in V$, then we have $(\sum_{i=1}^{n} a_i)v = \sum_{i=1}^{n} a_iv$. We will also adopt the convention 
that the sum of an empty set of vectors is equal to $0_V$. 

Clearly, any field F is a vector space over itself, where we take the vector addition 
to be the addition in F and scalar multiplication to be the multiplication in F. 

We also note an extremely important construction. Let F be a field and let $\Omega$ be 
a nonempty set. Assume that, for each $i \in \Omega$, we are given a vector space $V_i$ over F, 
the addition in which we will denote by $+_i$ (the vector spaces $V_i$ need not, however, 
be distinct from one another). Recall that $\prod_{i \in \Omega} V_i$ is the set of all those functions f 
from $\Omega$ to $\bigcup_{i \in \Omega} V_i$ which satisfy the condition that $f(i) \in V_i$ for each $i \in \Omega$. We 
now define the structure of a vector space on $\prod_{i \in \Omega} V_i$ as follows: if $f, g \in \prod_{i \in \Omega} V_i$ 
then $f + g$ is the function in $\prod_{i \in \Omega} V_i$ given by $f + g : i \mapsto f(i) +_i g(i)$ for each 
$i \in \Omega$. Moreover, if $a \in F$ and $f \in \prod_{i \in \Omega} V_i$, then $af$ is the function in $\prod_{i \in \Omega} V_i$ 
given by $af : i \mapsto a[f(i)]$ for each $i \in \Omega$. It is routine to verify that all of the 
axioms of a vector space are satisfied in this case. For example, the identity element 
for vector addition is just the function in $\prod_{i \in \Omega} V_i$ given by $i \mapsto 0_{V_i}$ for each $i \in \Omega$. 
This vector space is called the direct product of the vector spaces $V_i$ over F. If 
the set $\Omega$ is finite, say $\Omega = \{1, \ldots, n\}$, then we often write $V_1 \times \cdots \times V_n$ instead 
of $\prod_{i \in \Omega} V_i$. If all of the vector spaces $V_i$ are equal to the same vector space V, 
then we write $V^\Omega$ instead of $\prod_{i \in \Omega} V_i$, and if $\Omega = \{1, \ldots, n\}$ we write $V^n$ instead 
of $V^\Omega$. Note that a function f from a finite set $\Omega = \{1, \ldots, n\}$ to a vector space V 
is totally defined by the list $f(1), f(2), \ldots, f(n)$ of its values. Conversely, any list 
$v_1, \ldots, v_n$ of elements of V uniquely defines such a function f given by $f : i \mapsto v_i$. 
Therefore, this notation agrees with our previous use of the symbol $V^n$ to denote 
sets of n-tuples of elements of V. However, to emphasize the vector space structure 
here, we will write the elements of $V^n$ as columns of the form 
$\begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}$, where the 
$v_i$ are (not necessarily distinct) elements of V. Usually, we will consider the case 
V = F. Vector addition and scalar multiplication in $V^n$ are then defined by the rules 

$$\begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} + \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix} = \begin{bmatrix} v_1 + w_1 \\ \vdots \\ v_n + w_n \end{bmatrix} \quad \text{and} \quad c\begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} cv_1 \\ \vdots \\ cv_n \end{bmatrix}.$$ 

The “classical” study of vector spaces centers around the spaces $\mathbb{R}^n$, the vectors 
in which are identified with the points in n-dimensional Euclidean space. However, 
other vector spaces also have important applications. Vector spaces of the form $\mathbb{C}^n$ 
are needed for the study of functions of several complex variables. In algebraic 
coding theory, one is interested in spaces of the form $F^n$, where F is a finite field. 
The vectors in this space are words of length n and the field F is the alphabet in 
which these words are written. Thus, one choice for F is the Galois field $GF(2^8)$, 
the 256 elements of which are identified with the 256 ASCII symbols. 



The first explicit statement of the geometric “par- 
allelogram law” for adding geometric vectors 
was given by the sixteenth-century Pisan scien- 
tist Galileo Galilei. This idea was extended at the 
beginning of the nineteenth century by Bohemian 
priest Bernard Bolzano. 


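Let V be a vector space, let k and n be positive integers, and let 
$\Omega = \{(i, j) \mid 1 \leq i \leq k,\ 1 \leq j \leq n\}$. There exists a bijective correspondence 
between $V^\Omega$ and the set of all rectangular arrays of the form 

$$\begin{bmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{k1} & \cdots & v_{kn} \end{bmatrix}$$ 

in which the entries $v_{ij}$ are elements of V. Such an array is called a k × n matrix 
over V. We will denote the set of all such matrices by $\mathcal{M}_{k \times n}(V)$. Addition in 
$\mathcal{M}_{k \times n}(V)$ is given by 

$$\begin{bmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{k1} & \cdots & v_{kn} \end{bmatrix} + \begin{bmatrix} w_{11} & \cdots & w_{1n} \\ \vdots & \ddots & \vdots \\ w_{k1} & \cdots & w_{kn} \end{bmatrix} = \begin{bmatrix} v_{11} + w_{11} & \cdots & v_{1n} + w_{1n} \\ \vdots & \ddots & \vdots \\ v_{k1} + w_{k1} & \cdots & v_{kn} + w_{kn} \end{bmatrix}$$ 

and scalar multiplication in $\mathcal{M}_{k \times n}(V)$ is given by 

$$c\begin{bmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{k1} & \cdots & v_{kn} \end{bmatrix} = \begin{bmatrix} cv_{11} & \cdots & cv_{1n} \\ \vdots & \ddots & \vdots \\ cv_{k1} & \cdots & cv_{kn} \end{bmatrix}.$$ 

The identity element for vector addition in $\mathcal{M}_{k \times n}(V)$ is the 0-matrix O, all entries 
of which are equal to $0_V$. Note that $V^n = \mathcal{M}_{n \times 1}(V)$. 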


The term “matrix” was first coined by the nineteenth-century British 
mathematician James Joseph Sylvester, one of the major researchers 
in the theory of matrices and determinants. 


If V is a vector space and if $\Omega = \mathbb{N}$, then the elements of $V^\Omega$ are infinite 
sequences $[v_0, v_1, \ldots]$ of elements of V. We will denote this vector space, which we 
will need later, by $V^\infty$. Again, the space of particular interest will be $F^\infty$. 


Example If F is a subfield of a field K, then K is a vector space over F, with 
addition and scalar multiplication just being the corresponding operations in K. 
Thus, in particular, we can think of C as a vector space over R and of R as a vector 
space over Q. 


Example Let A be a nonempty set and let V be the collection of all subsets of A. 
Let us define addition of elements of V as follows: if B and C are elements of V 
then $B + C = (B \cup C) \setminus (B \cap C)$. This operation is usually called the symmetric 
difference of B and C. This definition turns V into a vector space over GF(2), where 
scalar multiplication is defined by $0B = \emptyset$ and $1B = B$ for all $B \in V$. This is 
actually just a special case of what we have seen before. Indeed, we note that there 
is a bijective function from V to $GF(2)^A$ which assigns to each subset B of A its 
characteristic function, namely the function $\chi_B$ defined by 


$$\chi_B : a \mapsto \begin{cases} 1 & \text{if } a \in B, \\ 0 & \text{otherwise,} \end{cases}$$ 


and it is easy to see that $\chi_B + \chi_C = \chi_{B+C}$, while $\chi_B\chi_C = \chi_{B \cap C}$. 
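
The correspondence between subsets and characteristic functions can be checked by brute force on a small set. The sketch below is illustrative only: the three-element ground set and all function names are assumptions made for the example, and GF(2) is modeled as the integers modulo 2.

```python
from itertools import combinations

A = ("p", "q", "r")  # hypothetical three-element ground set


def chi(B):
    """Characteristic function of B, stored as a tuple of 0s and 1s indexed by A."""
    return tuple(1 if a in B else 0 for a in A)


def add_mod2(f, g):
    """Componentwise addition in GF(2)."""
    return tuple((x + y) % 2 for x, y in zip(f, g))


def all_subsets(ground):
    """All subsets of the ground set, as frozensets."""
    return [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]


def check():
    """Verify chi(B + C) = chi(B) + chi(C) and chi(B & C) = chi(B) * chi(C)."""
    subsets = all_subsets(A)
    return all(
        chi(B ^ C) == add_mod2(chi(B), chi(C))
        and chi(B & C) == tuple(x * y for x, y in zip(chi(B), chi(C)))
        for B in subsets for C in subsets
    )
```

Here `B ^ C` is Python's built-in symmetric difference of sets, which plays the role of B + C in the example.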
Proposition 3.1 Let V be a vector space over a field F. 
(1) If $z \in V$ satisfies $z + v = v$ for all $v \in V$ then $z = 0_V$. 
(2) If $v, w \in V$ then there exists a unique element $y \in V$ satisfying $v + y = w$. 


Proof The proof is similar to the proofs of Proposition 2.1(1) and Proposi- 
tion 2.2(1). 


Proposition 3.2 Let V be a vector space over a field F. If $v, w \in V$ and if 
$a \in F$, then: 

(1) $a0_V = 0_V$; 

(2) $0v = 0_V$; 

(3) $(-1)v = -v$; 

(4) $(-a)v = -(av) = a(-v)$; 

(5) $-(-v) = v$; 

(6) $av = (-a)(-v)$; 

(7) $-(v + w) = -v - w$; 

(8) $a(v - w) = av - aw$; 

(9) If $av = 0_V$ then either $v = 0_V$ or $a = 0$. 


Proof The proof is similar to the proof of Proposition 2.3. 


Let V be a vector space over a field F. A nonempty subset W of V is a subspace 
of V if and only if it is a vector space in its own right with respect to the addition and 
scalar multiplication defined on V. Thus, any vector space V is a subspace of itself, 
called the improper subspace; any other subspace is proper. Also, $\{0_V\}$ is surely a 
subspace of V, called the trivial subspace; any other subspace is nontrivial. 
Note that the two closure conditions for a nonempty subset of a vector space to be a 
subspace (closure under vector addition and closure under scalar multiplication) are 
independent: the set of all vectors in $\mathbb{R}^3$ all entries of which are integers 
is closed under vector addition but not under scalar multiplication; the set of all 
vectors $\begin{bmatrix} a \\ b \\ c \end{bmatrix} \in \mathbb{R}^3$ satisfying abc = 0 is closed under scalar multiplication but not 
under vector addition. 
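
Concrete witnesses make the independence of the two closure conditions visible. The following sketch uses exact rational arithmetic as a stand-in for real scalars (an assumption made only so that the checks are exact); the function names are illustrative.

```python
from fractions import Fraction


def add(v, w):
    """Componentwise sum of two vectors."""
    return tuple(x + y for x, y in zip(v, w))


def scale(c, v):
    """Scalar multiple of a vector."""
    return tuple(c * x for x in v)


def has_integer_entries(v):
    """Membership test for the set of vectors with integer entries."""
    return all(Fraction(x).denominator == 1 for x in v)


def product_is_zero(v):
    """Membership test for the set of vectors (a, b, c) with abc = 0."""
    return v[0] * v[1] * v[2] == 0


# Integer vectors: a sum stays inside, but scaling by 1/2 leaves the set.
u = (1, 0, 0)
half_u = scale(Fraction(1, 2), u)

# Vectors with abc = 0: scaling stays inside, but this sum leaves the set.
x, y = (1, 0, 0), (0, 1, 1)
x_plus_y = add(x, y)
```

Here `half_u` is (1/2, 0, 0), which has a non-integer entry, while `x_plus_y` is (1, 1, 1), whose entries have product 1.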


Example Let V be a vector space over a field F and let $\Omega$ be a nonempty set. We 
have already seen that the set $V^\Omega$ of all functions from $\Omega$ to V is a vector space 
over F. If $\Lambda$ is a subset of $\Omega$ then the set $\{f \in V^\Omega \mid f(i) = 0_V \text{ for all } i \in \Lambda\}$ is a 
subspace of $V^\Omega$. In particular, if k < n are positive integers, then we can think of 
$V^k$ as being a subspace of $V^n$, by identifying it with 

$$\left\{ \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \in V^n \;\middle|\; v_{k+1} = \cdots = v_n = 0_V \right\}.$$ 

Note that if $y \in V$, then $\{f \in V^\Omega \mid f(i) = y \text{ for all } i \in \Lambda\}$ is not a subspace 
of $V^\Omega$ unless $y = 0_V$. 


Example Let $\{V_i \mid i \in \Omega\}$ be a collection of vector spaces over a field F. The set 
of all functions $f \in \prod_{i \in \Omega} V_i$ satisfying the condition that $f(i) \neq 0_{V_i}$ for at most 
finitely-many elements i of $\Omega$ is a subspace of $\prod_{i \in \Omega} V_i$, called the direct coproduct 
of the spaces $V_i$ and denoted by $\coprod_{i \in \Omega} V_i$. The direct coproduct is a proper subset of 
$\prod_{i \in \Omega} V_i$ when and only when the set $\Omega$ is infinite. If each of the spaces $V_i$ is equal 
to a given vector space V, we write $V^{(\Omega)}$ instead of $\coprod_{i \in \Omega} V_i$. 


Example If V is a vector space over a field F and if $v \in V$, then the set $Fv = \{av \mid 
a \in F\}$ is a subspace of V which is contained in any subspace of V containing v. 


Example Let $\mathbb{R}$ be the field of real numbers and let $\Omega$ be either equal to $\mathbb{R}$, to some 
closed interval [a, b] on the real line, or to a ray $[a, \infty)$ on the real line. We have 
already seen that the set $\mathbb{R}^\Omega$ of all functions from $\Omega$ to $\mathbb{R}$ is a vector space over $\mathbb{R}$. 
The set of all continuous functions from $\Omega$ to $\mathbb{R}$ is a subspace of this vector space, 
as are the set of all differentiable functions from $\Omega$ to $\mathbb{R}$, the set of all 
infinitely-differentiable functions from $\Omega$ to $\mathbb{R}$, and the set of all analytic functions from $\Omega$ 
to $\mathbb{R}$. If a < b are real numbers, we will denote the space of all continuous functions 
from the closed interval [a, b] to $\mathbb{R}$ by C(a, b). If $a \in \mathbb{R}$ we will denote the space of 
all continuous functions from $[a, \infty)$ to $\mathbb{R}$ by $C(a, \infty)$. These spaces will be very 
important to us later. 


Proposition 3.3 If V is a vector space over a field F , then a nonempty subset 
W of V is a subspace of V if and only if it is closed under addition and scalar 
multiplication. 


Proof If W is a subspace of V then it is surely closed under addition and scalar 
multiplication. Conversely, suppose that it is so closed. Then for any $w \in W$ we 
have $0_V = 0w \in W$ and $-w = (-1)w \in W$. The other conditions are satisfied in W 
since they are satisfied in V. 



The first fundamental research in spaces of functions was done by the 
German mathematician Erhard Schmidt, a student of David Hilbert, 
whose work forms one of the bases of functional analysis. 
Proposition 3.4 If V is a vector space over a field F, and if $\{W_i \mid i \in \Omega\}$ is a 
collection of subspaces of V, then $\bigcap_{i \in \Omega} W_i$ is a subspace of V. 


Proof Set $W = \bigcap_{i \in \Omega} W_i$. If $w, y \in W$ then, for each $i \in \Omega$, we have $w, y \in W_i$ and 
so $w + y \in W_i$. Thus $w + y \in W$. Similarly, if $a \in F$ and $w \in W$ then $aw \in W_i$ for 
each $i \in \Omega$, and so $aw \in W$. 


We will also set the convention that the intersection of an empty collection 
of subspaces of V is V itself. Subspaces W and W′ are disjoint if and only if 
$W \cap W' = \{0_V\}$. More generally, a collection $\{W_i \mid i \in \Omega\}$ of subspaces of V is 
pairwise disjoint if and only if $W_i \cap W_j = \{0_V\}$ for $i \neq j$ in $\Omega$. (Note that 
disjointness of subspaces of a given space is not the same as disjointness of subsets!) 


Now let us look at a very important method of constructing subspaces of vector 
spaces. Let D be a nonempty set of elements of a vector space V over a field F. 
A vector $v \in V$ is a linear combination of elements of D over F if and only if there 
exist elements $v_1, \ldots, v_n$ of D and scalars $a_1, \ldots, a_n$ in F such that $v = \sum_{i=1}^{n} a_iv_i$. 
We will denote the set of all linear combinations of elements of D over F by FD. 
Note that if $v \in V$ then $F\{v\}$ is the set Fv which we defined earlier. 

It is clear that if D is a nonempty set of elements of a vector space V over a 
field F then $D \subseteq FD$. Also, $0_V \in FD$ for any nonempty subset D of V, and it 
is the only vector belonging to each of the sets FD. To simplify notation, we will 
therefore define $F\emptyset$ to be $\{0_V\}$. If $D' \subseteq D$ then surely $FD' \subseteq FD$. We also note 
that $FD = F(D \cup \{0_V\})$ for any subset D of V. 


1 0 0 3 
Example If D= O;,) 1 and D/ = 2/,]3 are subsets of IR?, then 
0 0 0 0 
a 
FD=FD'= b || a,b €R}. Indeed, 
0 


P 1 0 pg POl.- 212 
plo Ole 4 = ) 21+213] foralla,deR. 
0 0 o| 310 
4 0 2 2 
2|=1]0]4+1]2}41] 0 
4 4 0 0 

2 y) 1 

= 1-0.) eI) 2 4441 

0 0 1 


Thus we see that there may be several ways of representing a vector as a linear 
combination of elements of a given subset of a vector space. 


Proposition 3.5 Let D be a subset of a vector space V over a field F. Then: 
(1) FD is a subspace of V; 

(2) Every subspace of V containing D also contains FD; 

(3) FD is the intersection of all subspaces of V containing D. 


Proof If $D = \emptyset$ then $FD = \{0_V\}$ and we are done. Thus we can assume that D is 
nonempty. It is an immediate consequence of the definitions that the sum of two 
linear combinations of elements of D over F is again a linear combination of elements 
of D over F, and that the product of a scalar and a linear combination of elements 
of D over F is again a linear combination of elements of D over F. This proves (1). 
Moreover, (2) is an immediate consequence of (1) and Proposition 3.3, while (3) 
follows directly from (2). 


If D is a subset of a vector space V over a field F then the subspace FD of V is 
called the subspace generated or spanned by D, and the set D is called a generating 
set or spanning set for this subspace. In particular, we note that $\emptyset$ is a generating 
set for $\{0_V\}$. 


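Example Let F be a field. Then 
$A = \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}$ 
is a generating set for $F^3$ over F. The set 
$B = \left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \right\}$ 
is also a generating set for $F^3$ if the 
characteristic of F is other than 2, but not for F = GF(2), since 
$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \notin GF(2)B$. 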
1 0 1 
The set D = O;,;1y,] 1 is not a generating set for F> for any field F 


since |} 0} g FD. 
1 
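
The claims about the sets A and B in this example can be tested by enumerating all linear combinations over a small prime field. This is a brute-force sketch, assuming that GF(p) is modeled as the integers modulo a prime p and that A and B are the sets of columns as read from the example; the function names are illustrative.

```python
from itertools import product


def span_mod_p(vectors, p):
    """Set of all linear combinations of the given vectors over GF(p)."""
    n = len(vectors[0])
    result = set()
    for coeffs in product(range(p), repeat=len(vectors)):
        combo = tuple(
            sum(c * v[i] for c, v in zip(coeffs, vectors)) % p for i in range(n)
        )
        result.add(combo)
    return result


def generates(vectors, p):
    """True when the vectors generate all of GF(p)^n."""
    return len(span_mod_p(vectors, p)) == p ** len(vectors[0])


A = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
B = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
```

Over GF(2) the span of B has only four elements, because the third vector of B is the sum of the first two, and the vector (1, 0, 0) is not among them; over GF(3) and GF(5) the span of B is the whole space.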
Often, in applications, we need to restrict ourselves to linear combinations of 
special type. For example, let V be a vector space over a field F and let D be a 
nonempty subset of V. An affine combination of elements of D is an element of 
V of the form $\sum_{i=1}^{n} a_iv_i$, where the $v_i$ are elements of D and the $a_i$ are scalars 
satisfying $\sum_{i=1}^{n} a_i = 1$. This is usually interpreted as a weighted average of the 
vectors $v_i$. The set of all affine combinations of elements of D is called the affine 
hull of D and is denoted by affh(D). In general, this is not a subspace of V. One 
can, however, easily verify that affh(affh(D)) = affh(D) for any set D. 


Proposition 3.6 Let V be a vector space over a field F and let $D_1$ and $D_2$ 
be subsets of V satisfying $D_1 \subseteq D_2 \subseteq FD_1$. Then $FD_1 = FD_2$. 


Proof Since $FD_1$ is a subspace of V containing $D_2$, we know by Proposition 3.5 
that $FD_2 \subseteq FD_1$. Conversely, any linear combination of elements of $D_1$ over F 
is also a linear combination of elements of $D_2$ over F and so $FD_1 \subseteq FD_2$, thus 
establishing equality. 


In particular, we note that FD = F(FD) for any subset D of V. 


Proposition 3.7 (Exchange Property) Let V be a vector space over a field 
F and let $v, w \in V$. Let D be a subset of V satisfying $v \in F(D \cup \{w\}) \setminus FD$. 
Then $w \in F(D \cup \{v\})$. 


Proof Since $v \in F(D \cup \{w\})$ we know that there exist elements $v_1, \ldots, v_n$ of D 
and scalars $a_1, \ldots, a_n, b$ in F satisfying the condition that $v = \sum_{i=1}^{n} a_iv_i + bw$. 
Moreover, since $v \notin FD$, we know that $b \neq 0$ and so $w = b^{-1}v - \sum_{i=1}^{n} b^{-1}a_iv_i \in 
F(D \cup \{v\})$. 


A vector space V over a field F is finitely generated over F if it has a finite 
generating set. Finitely-generated vector spaces are often much easier to deal with 
by purely algebraic methods and therefore, in several situations, we will have to 
restrict our discussion to these spaces. 


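Example If F is a field and n is a positive integer, then one sees that 

$$\left\{ \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \right\}$$ 

is a finite generating set for $F^n$ over F, and so $F^n$ 
is finitely generated over F. More generally, if V is a vector space finitely generated 
over a field F, say $V = F\{v_1, \ldots, v_k\}$, and if n is a positive integer, then 

$$\left\{ \begin{bmatrix} v_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} v_k \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ v_1 \\ \vdots \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} 0 \\ v_k \\ \vdots \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} 0 \\ 0 \\ \vdots \\ v_1 \end{bmatrix}, \ldots, \begin{bmatrix} 0 \\ 0 \\ \vdots \\ v_k \end{bmatrix} \right\}$$ 

is a generating set for $V^n$ over F having kn elements. 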


Example If F is a field and if k and n are positive integers, then the vector space 
$\mathcal{M}_{k \times n}(F)$ of all k × n matrices over F is finitely generated over F. Similarly, if V 
is a finitely-generated vector space over F, then the vector space $\mathcal{M}_{k \times n}(V)$ is also 
finitely generated over F. 


Example For any field F, the vector space $F^\infty$ is not finitely generated over F. 


Example The field $\mathbb{R}$ is finitely generated as a vector space over itself, but is not 
finitely generated as a vector space over $\mathbb{Q}$. 


Let V be a vector space over a field F. In Proposition 3.4, we saw that if 
$\{W_i \mid i \in \Omega\}$ is a collection of subspaces of V then $\bigcap_{i \in \Omega} W_i$ is a subspace of V. 
In the same way, we can define the subspace $\sum_{i \in \Omega} W_i$ of V to be the set of all 
vectors in V of the form $\sum_{j \in \Lambda} w_j$, where $\Lambda$ is a finite nonempty subset of $\Omega$ and 
$w_j \in W_j$ for each $j \in \Lambda$. In other words, $\sum_{i \in \Omega} W_i = F(\bigcup_{i \in \Omega} W_i)$. Indeed, from 
the definition of this sum, we see something stronger: if $D_i$ is a generating set for 
$W_i$ for each $i \in \Omega$ then $\sum_{i \in \Omega} W_i = F(\bigcup_{i \in \Omega} D_i)$. 

As a special case of the above, we see that if $W_1$ and $W_2$ are subspaces of V, 
then $W_1 + W_2$ equals the set of all vectors of the form $w_1 + w_2$, where $w_1 \in W_1$ 
and $w_2 \in W_2$. If both $W_1$ and $W_2$ are finitely generated then $W_1 + W_2$ is also finitely 
generated. By induction, we can then show that if $W_1, \ldots, W_n$ are finitely-generated 
subspaces of V, then $\sum_{i=1}^{n} W_i$ is also finitely generated. 


Proposition 3.8 If V is a vector space over a field F and if $\{W_i \mid i \in \Omega\}$ is a 
collection of subspaces of V, then: 

(1) $W_h$ is a subspace of $\sum_{i \in \Omega} W_i$ for all $h \in \Omega$; 

(2) If Y is a subspace of V satisfying the condition that $W_h$ is a subspace of 
Y for all $h \in \Omega$, then $\sum_{i \in \Omega} W_i$ is a subspace of Y. 


Proof (1) is clear from the definition. As for (2), if we have a subspace Y satisfying 
the given condition, if $\Lambda$ is a finite subset of $\Omega$, and if $w_j \in W_j$ for each $j \in \Lambda$, then 
$w_j \in Y$ for each j and so $\sum_{j \in \Lambda} w_j \in Y$. Thus $\sum_{i \in \Omega} W_i \subseteq Y$. 
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Proposition 3.9 If V is a vector space over a field F and if W,, W2, and W3 
are subspaces of V, then: 

(1) (Wi + W2) + W3 = Wi + (Wo + W3); 

(2) Wij+Wo=W24+W); 

(3) W3 0 [W2 + (W129 W3)) = (W129 W3) + (W229 W3); 

(4) (Modular law for subspaces): If W, © W3 then 


W311 (W2 + Wi) = Wi + (W229 W3). 


Proof Parts (1) and (2) follow immediately from the definition, while part (4) is
a special case of (3). We are therefore left to prove (3). Indeed, if $v$ belongs
to $W_3 \cap [W_2 + (W_1 \cap W_3)]$, then we can write $v = w_2 + y$, where $w_2 \in W_2$ and
$y \in W_1 \cap W_3$. Since $v, y \in W_3$, it follows that $w_2 = v - y \in W_3$, and so $v =
y + w_2 \in (W_1 \cap W_3) + (W_2 \cap W_3)$. Thus we see that $W_3 \cap [W_2 + (W_1 \cap W_3)] \subseteq
(W_1 \cap W_3) + (W_2 \cap W_3)$. Conversely, assume that $v \in (W_1 \cap W_3) + (W_2 \cap W_3)$. Then,
in particular, $v \in W_3$ and we can write $v = w_1 + w_2$, where $w_1 \in W_1 \cap W_3$ and $w_2 \in
W_2 \cap W_3$. Thus $v = w_1 + w_2 \in W_3 \cap [W_2 + (W_1 \cap W_3)]$. This shows that $(W_1 \cap W_3) +
(W_2 \cap W_3) \subseteq W_3 \cap [W_2 + (W_1 \cap W_3)]$, and so we have the desired equality.
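
Part (3) can also be checked by brute force in a small case. The sketch below is only an illustration under stated assumptions: it takes V = GF(2)^3, encodes vectors as the integers 0 to 7 with bitwise XOR as vector addition, and uses helper names (span, subspace_sum, part3_holds) that are hypothetical and not part of the text.

```python
from itertools import combinations

# Assumption: vectors of GF(2)^3 are encoded as integers 0..7,
# and vector addition is bitwise XOR.
VECTORS = range(8)

def span(generators):
    """Return the subspace of GF(2)^3 generated by the given vectors."""
    space = {0}
    for g in generators:
        space |= {v ^ g for v in space}
    return frozenset(space)

def subspace_sum(w, y):
    """W + Y = { w + y : w in W, y in Y }."""
    return frozenset(a ^ b for a in w for b in y)

# Every subspace of GF(2)^3 is generated by at most three vectors.
SUBSPACES = {span(c) for r in range(4) for c in combinations(VECTORS, r)}

def part3_holds(w1, w2, w3):
    """Check W3 ∩ [W2 + (W1 ∩ W3)] = (W1 ∩ W3) + (W2 ∩ W3)."""
    return w3 & subspace_sum(w2, w1 & w3) == subspace_sum(w1 & w3, w2 & w3)

all_ok = all(part3_holds(a, b, c)
             for a in SUBSPACES for b in SUBSPACES for c in SUBSPACES)
print(len(SUBSPACES), all_ok)
```

Such a check is of course no substitute for the proof; it merely confirms the identity for every triple of subspaces of one small space.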


Exercise 56
Is it possible to define on V = Z/(4) the structure of a vector space over GF(2) 
in such a way that the vector addition is the usual addition in Z/(4)? 


Exercise 57 

Consider the set $\mathbb{Z}$ of integers, together with the usual addition. If $a \in \mathbb{Q}$ and
$k \in \mathbb{Z}$, define $a \cdot k$ to be $\lfloor a \rfloor k$, where $\lfloor a \rfloor$ denotes the largest integer less than or
equal to $a$. Using this as our definition of “scalar multiplication”, have we turned
$\mathbb{Z}$ into a vector space over $\mathbb{Q}$?


Exercise 58 

Let $V = \{0, 1\}$ and let $F = GF(2)$. Define vector addition and scalar multiplication
by setting $v + v' = \max\{v, v'\}$, $0v = 0$, and $1v = v$ for all $v, v' \in V$. Does
this define on $V$ the structure of a vector space over $F$?


Exercise 59 
Let $p > 2$ and let $V$ be a vector space over $GF(p)$. Show that $v \ne -v$ for all
$0_V \ne v \in V$.


Exercise 60 
Let $V = C(0, 1)$. Define an operation $\boxplus$ on $V$ by setting $f \boxplus g : x \mapsto
\max\{f(x), g(x)\}$. Does this operation of vector addition, together with the usual
operation of scalar multiplication, define on $V$ the structure of a vector space
over $\mathbb{R}$?


Exercise 61 

Let V be a nontrivial vector space over R. For each v € V and each complex 
number a + bi, let us define (a + bi)v = av. Does V, together with this new 
scalar multiplication, form a vector space over C? 


Exercise 62 

Let $I$ be the unit interval $[0, 1]$ on the real line and let $V = \mathbb{R} \times I$. Define
operations of addition and scalar multiplication on $V$ as follows: $(a, s) + (b, t) =
(a + b, \min\{s, t\})$ and $c \cdot (a, s) = (ca, s)$. Is $V$ a vector space over $\mathbb{R}$?


Exercise 63 

Let $V = \{i \in \mathbb{Z} \mid 0 \le i < 2^n\}$ for some given positive integer $n$. Define operations
of vector addition and scalar multiplication on V in such a way as to turn it into 
a vector space over the field GF(2). 


Exercise 64 

Let $V$ be a vector space over a field $F$. Define a function from $GF(3) \times V$ to
$V$ by setting $(0, v) \mapsto 0_V$, $(1, v) \mapsto v$, and $(2, v) \mapsto -v$ for all $v \in V$. Does this
function, together with the vector addition in V, define on V the structure of a 
vector space over GF(3)? 


Exercise 65 
Give an example of a vector space having exactly 125 elements. 


Exercise 66 
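Let $V = \mathbb{Q}^2$, with the usual vector addition. If $a + b\sqrt{2} \in \mathbb{Q}(\sqrt{2})$ and if
$\begin{bmatrix} c \\ d \end{bmatrix} \in \mathbb{Q}^2$, set

$$(a + b\sqrt{2}) \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} ac + 2bd \\ bc + ad \end{bmatrix}.$$

Do these operations turn $\mathbb{Q}^2$ into a vector space over $\mathbb{Q}(\sqrt{2})$?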


Exercise 67 

Let $V = \mathbb{R} \cup \{\infty\}$ and extend the usual addition of real numbers by defining
$v + \infty = \infty + v = \infty$ for all $v \in V$. Is it possible to define an operation of scalar
multiplication on V in such a manner as to turn it into a vector space over R? 


Exercise 68 P ; j 
Let V =R?. || ‘ 7 € V andr ER, set | + Fr = es ‘| and 


r | = k 7 os . Do these operations define on V the structure of a vector 


space over R? If so, what is the identity element for vector addition in this space?


Exercise 69

Let V =R and let o be an operation on R defined by a o b = a°b. Is V, together 
with the usual addition and “scalar multiplication” given by o, a vector space 
over R? 


Exercise 70 
Show that Z is not a vector space over any field. 


Exercise 71 
Let $V$ be a vector space over the field $GF(2)$. Show that $v = -v$ for all $v \in V$.


Exercise 72 
In the definition of a vector space, show that the commutativity of vector addition 
is a consequence of the other conditions. 


Exercise 73 
Let W be the subset of R° consisting of all vectors an odd number of the entries 
in which are equal to 0. Is W a subspace of R°? 


Exercise 74 
Let $F$ be a field and fix $0 < k \in \mathbb{Z}$. Let $W$ be the subset of $F^{\mathbb{Z}}$ consisting of all
those functions $f$ satisfying

$$f(i + k) = \sum_{j=0}^{k-1} f(i + j)$$

for each $i \in \mathbb{Z}$. Is $W$ a subspace of $F^{\mathbb{Z}}$?


Exercise 75 
Let $W$ be the subset of $\mathbb{R}^3$ consisting of all vectors $\begin{bmatrix} a \\ b \\ c \end{bmatrix}$ satisfying $|a| + |b| = |c|$.
Is $W$ a subspace of $\mathbb{R}^3$?


Exercise 76 

Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ containing the constant function $x \mapsto 0$
and all of those functions $f \in V$ satisfying the condition that $f(a) = 0$ for at most
finitely-many real numbers $a$. Is $W$ a subspace of $V$?


Exercise 77 
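Let $V = \left\{ \begin{bmatrix} a_1 \\ \vdots \\ a_5 \end{bmatrix} \;\middle|\; 0 < a_i \in \mathbb{R} \right\}$. If $v = \begin{bmatrix} a_1 \\ \vdots \\ a_5 \end{bmatrix}$ and $w = \begin{bmatrix} b_1 \\ \vdots \\ b_5 \end{bmatrix}$ belong to $V$, and
if $c \in \mathbb{R}$, set $v + w = \begin{bmatrix} a_1 b_1 \\ \vdots \\ a_5 b_5 \end{bmatrix}$ and $cv = \begin{bmatrix} a_1^c \\ \vdots \\ a_5^c \end{bmatrix}$. Do these operations turn $V$ into
a vector space over $\mathbb{R}$?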


Exercise 78 
How many elements are there in the subspace of $GF(3)^3$ generated by

$$\left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} \right\}?$$


Exercise 79 
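A function $f \in \mathbb{R}^{\mathbb{R}}$ is piecewise constant if and only if it is a constant function
$x \mapsto c$ or there exist $a_1 < a_2 < \cdots < a_n$ and $c_0, \ldots, c_n$ in $\mathbb{R}$ such that

$$f : x \mapsto \begin{cases} c_0 & \text{if } x < a_1, \\ c_i & \text{if } a_i \le x < a_{i+1} \text{ for } 1 \le i < n, \\ c_n & \text{if } a_n \le x. \end{cases}$$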


Does the set of all piecewise constant functions form a subspace of the vector 
space R® over R? 


Exercise 80 

Let V be the vector space of all continuous functions from R to itself and let W 
be the subset of all those functions $f \in V$ satisfying the condition that $|f(x)| < 1$
for all $-1 < x < 1$. Is $W$ a subspace of $V$?


Exercise 81 
Let $W$ be the subset of $V = GF(2)^5$ consisting of all vectors $\begin{bmatrix} a_1 \\ \vdots \\ a_5 \end{bmatrix}$ satisfying
$\sum_{i=1}^{5} a_i = 0$. Is $W$ a subspace of $V$?


Exercise 82 
Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ consisting of all monotonically-increasing
or monotonically-decreasing functions. Is $W$ a subspace of $V$?


Exercise 83 
Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ consisting of the constant function
$a \mapsto 0$, and all epic functions. Is $W$ a subspace of $V$?


Exercise 84 

Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ containing the constant function $a \mapsto 0$
and all of those functions $f \in V$ satisfying the condition that $f(\pi) > f(-\pi)$. Is
$W$ a subspace of $V$?


Exercise 85

Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subset of $V$ consisting of all functions $f$ satisfying
the condition that there exists a real number $c$ (which depends on $f$) such that
$|f(a)| \le c|a|$ for all $a \in \mathbb{R}$. Is $W$ a subspace of $V$?


Exercise 86 

Let V = R® and let W be the subset of V consisting of all functions f satisfying 
the condition that there exist real numbers a and b such that | f (x)| < a| sin(x)|+ 
b| cos(x)| for all x => 0. Is W a subspace of V? 


Exercise 87 
Let $F$ be a field and let $V = F^{\mathbb{Z}}$, which is a vector space over $F$. Let $W$ be the
set of all functions $f \in V$ satisfying $f(1) = f(-1)$. Is $W$ a subspace of $V$?


Exercise 88 

For any real number $0 < t < 1$, let $V_t$ be the set of all functions $f \in \mathbb{R}^{\mathbb{R}}$ satisfying
the condition that if $a < b$ in $\mathbb{R}$ then there exists a real number $u(a, b)$ satisfying
$|f(x) - f(y)| \le u(a, b)|x - y|^t$ for all $a < x, y < b$. For which values of $t$ is $V_t$
a subspace of $\mathbb{R}^{\mathbb{R}}$?


Exercise 89 
Let $U$ be a nonempty subset of a vector space $V$. Show that $U$ is a subspace of
$V$ if and only if $au + u' \in U$ for all $u, u' \in U$ and $a \in F$.


Exercise 90 

Let V be a vector space over a field F and let v and w be distinct vectors in V. 
Set $U = \{(1 - t)v + tw \mid t \in F\}$. Show that there exists a vector $y \in V$ such that
$\{u + y \mid u \in U\}$ is a subspace of $V$.


Exercise 91 
Let V be a vector space over a field F and let W and Y be subspaces of V?. Let 


0) eae 2s F 
U be the set of all vectors | € V* satisfying the condition that there exists a 


" 


vector v” € V such that | € W and | € Y.Is U a subspace of V7? 


Exercise 92 

Consider $\mathbb{R}$ as a vector space over $\mathbb{Q}$. Given a nonempty subset $W$ of $\mathbb{R}$, let
$\overline{W}$ be the set of all real numbers $b$ for which there exists a sequence $a_1, a_2, \ldots$
of elements of $W$ satisfying $\lim_{i \to \infty} a_i = b$. Show that $\overline{W}$ is a subspace of $\mathbb{R}$
whenever $W$ is.


Exercise 93 

Let V be a vector space over a field F and let P be the collection of all sub- 
sets of V, which we know is a vector space over GF(2). Is the collection of all 
subspaces of V a subspace of P?


Exercise 94
Let $W$ be the set of all functions $f \in \mathbb{R}^{\mathbb{N}}$ satisfying the condition that if $f(i) \ne 0$
then $f(ji) \ne 0$ for all positive integers $j$. Is $W$ a subspace of $\mathbb{R}^{\mathbb{N}}$?


Exercise 95 
Let $W$ be the set of all functions $f \in \mathbb{R}^{\mathbb{N}}$ satisfying the condition that if $f(i) = 0$
then $f(ji) = 0$ for all positive integers $j$. Is $W$ a subspace of $\mathbb{R}^{\mathbb{N}}$?


Exercise 96 


Let $V$ be a vector space over a field $F$ and let $Y$ be the set of all matrices of the
form

$$\begin{bmatrix} v_1 & v_2 & 0_V \\ 0_V & v_1 + v_2 & 0_V \\ 0_V & v_1 & v_2 \end{bmatrix}$$

in $M_{3 \times 3}(V)$. Is $Y$ a subspace of $M_{3 \times 3}(V)$?

Exercise 97 


Let $W$ be the set of all functions $f \in \mathbb{R}^{\mathbb{R}}$ satisfying the following condition:
there exist positive real numbers $a$ and $b$ such that for all $x \in \mathbb{R}$ satisfying $|x| > a$
we have $|f(x)| < b|x|$. Show that $W$ is a subspace of $\mathbb{R}^{\mathbb{R}}$.


Exercise 98 
Let $W$ be a subspace of a vector space $V$ over a field $F$. Is the set $(V \setminus W) \cup \{0_V\}$
necessarily a subspace of $V$?


Exercise 99 

Let $V$ be a vector space over a field $F$ and let $f$ be a function from $V$ to the
unit interval $[0, 1]$ on the real line satisfying the condition that $f(au + bv) \ge
\min\{f(u), f(v)\}$ for all $a, b \in F$ and all $u, v \in V$. Show that $f(0_V) \ge f(v)$ for
all $v \in V$ and that if $0 \le h \le f(0_V)$ then $V_h = \{v \in V \mid f(v) \ge h\}$ is a subspace
of $V$.


Exercise 100 
Consider the elements $f, g, h$ of $\mathbb{Q}^{\mathbb{Q}}$ defined by $f : t \mapsto t - 1$, $g : t \mapsto t + 1$, and
$h : t \mapsto t^2 + 1$. Does the function $t \mapsto t^2$ belong to $\mathbb{Q}\{f, g, h\}$?


Exercise 101 


1 1 
Let F = GF(3) and let D= 1],} 0 . For which scalars c is | 1 | a 
0 2 Cc 


linear combination of elements of D?


Exercise 102


4 
Find a real number c such that | 3 | € R°? is a linear combination of 
1 
3 —1 
1], 2 
Cc 1 


Exercise 103 
Find subsets $D$ and $D'$ of $\mathbb{R}^2$ such that $\mathbb{R}(D \cap D') \ne \mathbb{R}D \cap \mathbb{R}D'$.


Exercise 104 
Find subspaces W and Y of R? having the property that W U Y is not a subspace 
of R?. 


Exercise 105 
Let $V$ be a vector space over a field $F$ and let $0_V \ne w \in V$. Given a vector
$v \in V \setminus Fw$, find the set $G$ of all scalars $a \in F$ satisfying $F\{v, w\} = F\{v, aw\}$.


Exercise 106 
Let p be a prime integer and let V be a vector space over F = GF(p). Show that 
V is not the union of k subspaces, for any k < p. 


Exercise 107 

Let V be a vector space over a field F and let c and d be fixed elements of F. 
Define a new operation $\boxplus$ on $V$ by setting $v \boxplus v' = cv + dv'$. Is $V$, with this new
vector addition and the old scalar multiplication, still a vector space over $F$?


Exercise 108 

Let $I$ be the closed unit interval $[0, 1]$ on the real line. A function $(a, b) \mapsto a \circ b$
from $I \times I$ to $I$ is a triangular norm¹ if and only if the following conditions hold
for all $a, b, c \in I$:

(1) $a \circ 1 = a$;
(2) $a \le c$ implies that $a \circ b \le c \circ b$;
(3) $a \circ b = b \circ a$;
(4) $a \circ (b \circ c) = (a \circ b) \circ c$.

Given a vector space $V$ over a field $F$, and given a triangular norm $\circ$ on $I$,
a function $f : V \to I$ is a $\circ$-fuzzy subspace of $V$ if and only if, for each $v, w \in V$
and each $d \in F$, we have $f(v + w) \ge f(v) \circ f(w)$ and $f(dv) \ge f(v)$. Find
a condition that a $\circ$-fuzzy subspace $f$ of $V$ must satisfy for the set $\{v \in V \mid
f(v) \ge a\}$ to be a subspace of $V$ for any $a \in I$.


¹ Triangular norms play a very important part in the theory of probabilistic metric spaces and have
important applications in statistics and in mathematical economics, as well as such areas as pattern
recognition and capacity theory.


Exercise 109

Let $V$ be a vector space over a field $F$ and let $D$ be a nonempty subset of $V$.
A zero-sum combination of elements of $D$ is an element of the form $\sum_{i=1}^{n} a_i v_i$,
where the $v_i$ are elements of $D$ and the $a_i$ are scalars satisfying $\sum_{i=1}^{n} a_i = 0$. The
set $z(D)$ of all zero-sum combinations of elements of $D$ is called the zero-sum
hull of $D$. Is it true that $z(z(D)) = z(D)$? Is $z(D)$ necessarily a subspace of $V$?


Exercise 110 

Let V be a vector space over a field F and let D be a nonempty subset of V. 
A uniform combination of elements of $D$ is an element of the form $\sum_{i=1}^{n} a_i v_i$,
where the $v_i$ are elements of $D$ and $a_1 = \cdots = a_n$. The set $u(D)$ of all uniform
combinations of elements of D is called the uniform hull of D. Is it true that 
u(u(D)) = u(D)? Is u(D) necessarily a subspace of V? 


Exercise 111 
If we identify $\mathbb{R}^2$ with the Euclidean plane in the usual way, and if $v \ne w$ are two
vectors in $\mathbb{R}^2$, show that affh($\{v, w\}$) is the line passing through these two points.


Exercise 112 

If we identify $\mathbb{R}^3$ with the three-dimensional Euclidean space in the usual way
and if $v, w, y$ are distinct vectors in $\mathbb{R}^3$ which are not collinear, show that
affh($\{v, w, y\}$) is the plane determined by the three points.


Exercise 113 
Let $V$ be a vector space over a field $F$ and let $D$ be a subset of $V$ containing $0_V$.
Show that affh($D$) is a subspace of $V$.


Algebras Over a Field 


In general, a vector space does not carry with it the notion of multiplying two vectors
in the space to produce a third vector. However, sometimes such multiplication may
be possible. A vector space $K$ over a field $F$ is an F-algebra if and only if there
exists a function $(v, w) \mapsto v \bullet w$ from $K \times K$ to $K$ such that

(1) $u \bullet (v + w) = u \bullet v + u \bullet w$;

(2) $(u + v) \bullet w = u \bullet w + v \bullet w$;

(3) $a(v \bullet w) = (av) \bullet w = v \bullet (aw)$

for all $u, v, w \in K$ and $a \in F$. As in the proof of Proposition 2.3(1), these conditions
suffice to show that $0_K \bullet v = v \bullet 0_K = 0_K$ for all $v \in K$.

Note that the operation $\bullet$ need not be associative, nor need there exist an identity
element for this operation. When the operation is associative, i.e. when it satisfies
(4) $v \bullet (w \bullet y) = (v \bullet w) \bullet y$
for all $v, w, y \in K$, then the algebra is called an associative F-algebra. If an identity
element for $\bullet$ exists, that is to say, if there exists an element $0_K \ne e \in K$ satisfying
$v \bullet e = v = e \bullet v$ for all $v \in K$, we say the F-algebra $K$ is unital. In a unital
F-algebra, as with the case of fields, the identity element must be unique. In this
case, we can then identify $F$ with the subset $\{ae \mid a \in F\}$ of $K$ and we note that
$a \bullet v = v \bullet a$ for all $v \in K$ and $a \in F$.

If $v$ is an element of an associative F-algebra $(K, \bullet)$ and if $n$ is a positive integer,
we write $v^n$ instead of $v \bullet \cdots \bullet v$ ($n$ factors). If $K$ is also unital and has a
multiplicative identity $e$, we set $v^0 = e$ for all $0_K \ne v \in K$. The element $(0_K)^0$ is
not defined.

If $v \bullet w = w \bullet v$ for all $v$ and $w$ in some F-algebra $K$, then the algebra is commutative.
An F-algebra $(K, \bullet)$ satisfying the condition that $v \bullet w = -w \bullet v$ for all
$v, w \in K$ is anticommutative. If the characteristic of $F$ is other than 2, it is easy to
see that this condition is equivalent to the condition that $v \bullet v = 0_K$ for all $v \in K$.
Of course, in that case $K$ cannot possibly be unital.


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 39 
DOI 10.1007/978-94-007-2636-9_4, © Springer Science+Business Media B.V. 2012 




With kind permission of the Harvard University Archives, HUP (B. Peirce and C.S. Peirce); With kind per- 
mission of the American Mathematical Society (Dickson); With kind permission of the Bryn Mawr College 
Library, Special Collections (Noether). 


The first systematic study of associative algebras was initiated by the nineteenth-century 
American mathematician Benjamin Peirce and continued by his son, the mathematician 
and logician Charles Sanders Peirce. Other major contributors at the beginning of the 
twentieth century were the American mathematician Leonard Dickson, the Scottish math- 
ematician Joseph Henry Wedderburn, and the German mathematician Emmy Noether, 
generally known as “the mother of modern algebra”. 


If $(K, \bullet)$ is an associative unital F-algebra having a multiplicative identity $e$,
and if $v \in K$ satisfies the condition that there exists an element $w \in K$ such that
$v \bullet w = w \bullet v = e$, then we say that $v$ is a unit of $K$. As with the case of fields, such
an element $w$, if it exists, is unique and is usually denoted by $v^{-1}$. If $v$ is a unit, then
so is $-v$, for one immediately notes that $(-v)^{-1} = -(v^{-1})$. Also, it is easy to see
that if $v$ and $w$ are units of $K$, then so is $v \bullet w$. Indeed,

$$(v \bullet w) \bullet (w^{-1} \bullet v^{-1}) = (v \bullet (w \bullet w^{-1})) \bullet v^{-1} = (v \bullet e) \bullet v^{-1} = v \bullet v^{-1} = e$$

and similarly $(w^{-1} \bullet v^{-1}) \bullet (v \bullet w) = e$, so $(v \bullet w)^{-1} = w^{-1} \bullet v^{-1}$. If $v \in K$ is a
unit and if $n > 1$ is an integer, we write $v^{-n}$ instead of $(v^{-1})^n$. Note that Hua's
identity (Proposition 2.6) in fact holds in any associative unital F-algebra in which
the needed inverses exist, since the proof relies only on associativity of addition and
multiplication and distributivity of multiplication over addition.

If $(K, \bullet)$ is an F-algebra, and $v, w \in K$, then $(v, w)$ forms a commuting pair
if and only if $v \bullet w = w \bullet v$. Of course, if the algebra $K$ is commutative, all pairs
of elements commute, but in general that will not be the case. Note that if $(v, w)$
is a commuting pair in a unital associative F-algebra $(K, \bullet)$ and $v^{-1}$ exists, then
$(v^{-1}, w)$ is also a commuting pair. Indeed, $(v^{-1} \bullet w) \bullet v = v^{-1} \bullet (w \bullet v) = v^{-1} \bullet
(v \bullet w) = w$, so $w \bullet v^{-1} = [(v^{-1} \bullet w) \bullet v] \bullet v^{-1} = v^{-1} \bullet w$.

Example Any vector space $V$ over a field $F$ can be turned into an associative and
commutative F-algebra which is not unital by setting $v \bullet w = 0_V$ for all $v, w \in V$.


Example If $F$ is a subfield of a field $K$, then $K$ has the structure of an associative
F-algebra, with multiplication being the multiplication in $K$. Thus, $\mathbb{C}$ is an
$\mathbb{R}$-algebra and $\mathbb{Q}(\sqrt{p})$ is a $\mathbb{Q}$-algebra for every prime integer $p$. These algebras are,
of course, unital.


Example Let $F$ be a field, let $(K, \bullet)$ be an F-algebra, and let $\Omega$ be a nonempty
set. Then the vector space $K^\Omega$ of all functions from $\Omega$ to $K$ has the structure of
an F-algebra with respect to the operation $\bullet$ defined by $f \bullet g : i \mapsto f(i) \bullet g(i)$ for
all $i \in \Omega$. This F-algebra is associative if $K$ is. If $K$ is unital with multiplicative
identity element $e$, then $K^\Omega$ is also unital, with identity element given by the constant
function $i \mapsto e$. In particular, if $F$ is a field and if $\Omega$ is a nonempty set then
$F^\Omega$ is an associative unital F-algebra with respect to the operation $\bullet$ defined by
$f \bullet g : i \mapsto f(i)g(i)$ for all $i \in \Omega$.


Example We have seen that the collection of all subsets of a given nonempty set
$A$ is a vector space over $GF(2)$. It is in fact an associative and commutative unital
$GF(2)$-algebra with respect to the operation $\cap$. The identity element with respect to
this operation is $A$ itself.
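
As a small sanity check of this example, the sketch below verifies the algebra axioms by brute force for one three-element set. It assumes that vector addition on subsets is symmetric difference (the usual GF(2)-vector-space structure on a power set); the set A = {1, 2, 3} and the names add and mul are illustrative choices, not taken from the text.

```python
from itertools import combinations

A = frozenset({1, 2, 3})
SUBSETS = [frozenset(c)
           for r in range(len(A) + 1)
           for c in combinations(sorted(A), r)]

def add(x, y):
    """Vector addition (assumed): symmetric difference of subsets."""
    return x ^ y

def mul(x, y):
    """Algebra multiplication: intersection of subsets."""
    return x & y

# Distributivity of intersection over symmetric difference.
distributive = all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
                   for x in SUBSETS for y in SUBSETS for z in SUBSETS)
associative = all(mul(x, mul(y, z)) == mul(mul(x, y), z)
                  for x in SUBSETS for y in SUBSETS for z in SUBSETS)
commutative = all(mul(x, y) == mul(y, x) for x in SUBSETS for y in SUBSETS)
# A itself acts as the multiplicative identity.
identity_is_A = all(mul(A, x) == x for x in SUBSETS)

print(distributive, associative, commutative, identity_is_A)
```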


Example Define an operation $*$ on $C(0, \infty)$ by setting

$$f * g : t \mapsto \int_0^t f(t - u)g(u)\,du.$$

This turns $C(0, \infty)$ into an associative and commutative $\mathbb{R}$-algebra, known as the
convolution algebra on $\mathbb{R}$.


Example Let $K$ be the vector space over $\mathbb{R}$ consisting of all functions in $\mathbb{R}^{\mathbb{R}}$ which
are infinitely differentiable, and define an operation $\bullet$ on $K$ by setting $f \bullet g = (fg)'$
(where $'$ denotes differentiation). Then $(K, \bullet)$ is an algebra which is commutative
but not associative.
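
The failure of associativity in this example can be seen already on polynomial functions, which are infinitely differentiable. The sketch below is an illustration under assumptions: polynomials are represented as coefficient lists [a0, a1, ...], and the helper names poly_mul, derivative, and bullet are hypothetical.

```python
def poly_mul(f, g):
    """Multiply polynomials given as coefficient lists [a0, a1, ...]."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def derivative(f):
    """Differentiate a coefficient list; the result has at least one entry."""
    return [i * a for i, a in enumerate(f)][1:] or [0]

def bullet(f, g):
    """The product f • g = (fg)' from the example."""
    return derivative(poly_mul(f, g))

one, x = [1], [0, 1]                # the functions t -> 1 and t -> t
left = bullet(bullet(one, x), x)    # (1 • x) • x
right = bullet(one, bullet(x, x))   # 1 • (x • x)
print(left, right)                  # the two results differ
```

Here (1 • x) • x is the constant function 1 while 1 • (x • x) is the constant function 2, so the operation is not associative, although it is visibly commutative because fg = gf.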


The collection of all operations e on a vector space V over a field F which turn 
V into an F-algebra will be studied in more detail in Chap. 20. 

Let $F$ be a field. If $(K, \bullet)$ is an F-algebra, then a subspace $L$ of $K$ satisfying
the condition that $w \bullet w' \in L$ for all $w, w' \in L$ is an F-subalgebra of $K$. If $(K, \bullet)$
is a unital F-algebra, then $L$ is a unital subalgebra if it contains the multiplicative
identity element of $K$.


Let $F$ be a field. An anticommutative F-algebra $(K, \bullet)$ is a Lie algebra over $F$
if and only if it satisfies the additional condition

(Jacobi identity) $u \bullet (v \bullet w) + v \bullet (w \bullet u) + w \bullet (u \bullet v) = 0_K$

for all $u, v, w \in K$. This algebra is not associative unless $u \bullet v = 0_K$ for all $u, v \in K$.


With kind permission of the Archives of the Mathematisches
Forschungsinstitut Oberwolfach.

Sophus Lie was a nineteenth-century Norwegian mathematician who developed mathematical
concepts that provide the basic model for quantum theory and an important tool in differential
geometry. They were independently defined by the nineteenth-century German teacher Wilhelm
Karl Joseph Killing, in connection with his work on non-Euclidean geometry. Another pioneer
in the study of noncommutative algebras because of their importance in physics was the
twentieth-century British mathematician Dudley Littlewood.


Example Let $F$ be a field and let $(K, *)$ be an associative F-algebra. Define a new
operation $\bullet$ on $K$ by setting $v \bullet w = v * w - w * v$. Then $(K, \bullet)$ is a Lie algebra
over $F$, which is usually denoted by $K^-$. The operation in $K^-$ is known as the Lie
product defined on the given F-algebra $K$. This example is very important because
one can show that any Lie algebra over a field $F$ can be considered as a subalgebra
of a Lie algebra of the form $K^-$ for some associative F-algebra $K$. (A proof of this
result, known as the Poincaré-Birkhoff-Witt Theorem, is far beyond the scope of
this book.) If $v, w \in K$, then $v \bullet w = 0_K$ precisely when $v * w = w * v$, in other
words, precisely when $(v, w)$ forms a commuting pair in $(K, *)$.
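
To make the Lie product concrete, the following sketch checks anticommutativity and the Jacobi identity numerically. It assumes, purely for illustration, that the associative algebra is the algebra of 2 x 2 matrices (here with integer entries) under ordinary matrix multiplication; the helper names and the sample matrices are hypothetical.

```python
def mat_mul(a, b):
    """Ordinary product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def mat_add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def mat_neg(a):
    return tuple(tuple(-x for x in row) for row in a)

def lie(v, w):
    """Lie product v • w = v * w - w * v."""
    return mat_add(mat_mul(v, w), mat_neg(mat_mul(w, v)))

ZERO = ((0, 0), (0, 0))
u = ((1, 2), (3, 4))
v = ((0, 1), (1, 0))
w = ((2, 0), (1, -1))

# Left-hand side of the Jacobi identity; it should be the zero matrix.
jacobi_sum = mat_add(mat_add(lie(u, lie(v, w)), lie(v, lie(w, u))),
                     lie(w, lie(u, v)))
print(jacobi_sum == ZERO, lie(u, v) == mat_neg(lie(v, u)))
```

The last remark of the example is also visible here: since u commutes with u * u, the Lie product of that pair is the zero matrix.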


Lie algebras are of fundamental importance in modeling problems in physics,
and have many other applications; they are in the forefront of current mathematical
research. One particular Lie algebra defined on $\mathbb{R}^3$ goes back to the work of Grassmann.
Define the structure of an $\mathbb{R}$-algebra on $\mathbb{R}^3$ with multiplication $\times$ given by

$$\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \times \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix}.$$

This operation, called the cross product, has very important applications in physics
and engineering. It is easy to check that the algebra $(\mathbb{R}^3, \times)$ is a Lie algebra over $\mathbb{R}$.
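
The check that $(\mathbb{R}^3, \times)$ is a Lie algebra can be spot-tested numerically. The sketch below implements the displayed formula on integer triples and evaluates anticommutativity and the Jacobi identity for one choice of vectors; the function names and sample vectors are illustrative assumptions, and a numerical test on sample vectors is not a proof.

```python
def cross(a, b):
    """Cross product of two vectors of R^3 given as 3-tuples."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def neg(a):
    return tuple(-x for x in a)

u, v, w = (1, 2, 3), (-1, 0, 4), (2, 5, -3)

anticommutative = cross(u, v) == neg(cross(v, u))
# Left-hand side of the Jacobi identity; it should be the zero vector.
jacobi_sum = add(add(cross(u, cross(v, w)), cross(v, cross(w, u))),
                 cross(w, cross(u, v)))
print(anticommutative, jacobi_sum)
```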


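Note that if $v_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $v_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, and $v_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, then, surely, $v_1 \times v_2 =
v_3$, $v_2 \times v_3 = v_1$, and $v_3 \times v_1 = v_2$. Moreover, the cross product is the only possible
anticommutative product which can be defined on $\mathbb{R}^3$ and which satisfies this
condition. Indeed, if $\bullet$ is any such product defined on $\mathbb{R}^3$ then

$$\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \bullet \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \Bigl(\sum_{i=1}^{3} a_i v_i\Bigr) \bullet \Bigl(\sum_{j=1}^{3} b_j v_j\Bigr) = \sum_{i=1}^{3} \sum_{j=1}^{3} a_i b_j (v_i \bullet v_j)$$

$$= \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \times \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}.$$
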
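Proposition 4.1 If $v$ and $w$ are nonzero elements of $\mathbb{R}^3$, then $v \times w = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$
if and only if $\mathbb{R}v = \mathbb{R}w$.

Proof Suppose $v = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$ and $w = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$, and assume first that $v \times w$ is the zero vector.
These vectors are nonzero and so one
of the entries in $w$ is nonzero; without loss of generality, we can assume that $b_1 \ne 0$.
Then $a_2 b_3 - a_3 b_2 = a_3 b_1 - a_1 b_3 = a_1 b_2 - a_2 b_1 = 0$ and so, if we define $c = a_1 b_1^{-1}$,
we have $v = cw$. Hence $v \in \mathbb{R}w$. Moreover, $c \ne 0$ so $w = c^{-1}v \in \mathbb{R}v$, proving the
desired equality. Conversely, if $\mathbb{R}v = \mathbb{R}w$ then there exists an $0 \ne d \in \mathbb{R}$ such that
$w = dv$. Then $v \times w = d(v \times v) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$.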


The cross product is very particular to the vector space $\mathbb{R}^3$, and does not generalize
easily to spaces of the form $\mathbb{R}^n$ for $n > 3$, with the exception of $n = 7$, which
we will see in a later chapter.


An important non-associative algebra is the following: let $F$ be a field of characteristic
other than 2, and let $(K, *)$ be an associative algebra. We can define a new
operation $\bullet$ on $K$, called the Jordan product, by setting $v \bullet w = \frac{1}{2}(v * w + w * v)$.
Then $(K, \bullet)$ is a commutative F-algebra, usually denoted by $K^+$, called the Jordan
algebra defined by $K$. It is not associative in general, but does satisfy

(Jordan identity) $(v \bullet w) \bullet (v \bullet v) = v \bullet (w \bullet (v \bullet v))$

for all $v, w \in K$. Jordan algebras have important applications in physics. Note that if
$v * w = w * v$, then $v \bullet w = v * w$. This observation will have important consequences
later. In particular, if $(K, *)$ is unital with multiplicative identity $e$, then $e \bullet v =
v \bullet e = v$ for all $v \in K$, so $K^+$ is also unital.
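
As an illustration of both claims (the Jordan identity holds, but associativity can fail), the sketch below works in a concrete case. It assumes the associative algebra is that of 2 x 2 rational matrices under matrix multiplication, uses exact arithmetic via Python's fractions module, and introduces hypothetical helper names and sample matrices.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Ordinary product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def jordan(v, w):
    """Jordan product v • w = (v * w + w * v) / 2."""
    vw, wv = mat_mul(v, w), mat_mul(w, v)
    half = Fraction(1, 2)
    return tuple(
        tuple(half * (vw[i][j] + wv[i][j]) for j in range(2))
        for i in range(2)
    )

v = ((1, 2), (3, 4))
w = ((0, 1), (5, -2))
vv = jordan(v, v)
# Jordan identity: (v • w) • (v • v) = v • (w • (v • v)).
jordan_identity_holds = jordan(jordan(v, w), vv) == jordan(v, jordan(w, vv))

# Three matrix units on which the Jordan product fails to be associative.
e12 = ((0, 1), (0, 0))
e21 = ((0, 0), (1, 0))
e11 = ((1, 0), (0, 0))
associative_here = (jordan(jordan(e12, e21), e11)
                    == jordan(e12, jordan(e21, e11)))
print(jordan_identity_holds, associative_here)
```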




© the estate of Friedrich Hund. Repro- 
duced with kind permission of Gerhard 
Hund (Jordan); With kind permission 
of the Archives of the Mathematisches 
Forschungsinstitut Oberwolfach (Jacob- 
son); With kind permission of the Amer- 
ican Mathematical Society (Albert). 


Jordan algebras were developed 
by the twentieth-century German 
physicist Pascual Jordan, one of the fathers of quantum mechanics and quantum electrody- 
namics. The algebraic structure of Lie algebras and Jordan algebras was studied in detail by 
the twentieth-century American mathematicians Nathan Jacobson and A. Adrian Albert. 


We now come to an extremely important algebra. Let $F$ be a field and let $X$
be an element not in $F$, which we will call an indeterminate. A polynomial in $X$
with coefficients in $F$ is a formal sum $f(X) = \sum_{i=0}^{\infty} a_i X^i$, in which the elements $a_i$
belong to $F$, and no more than a finite number of these elements differ from 0. The
elements $a_i$ are called the coefficients of the polynomial. If all of the $a_i$ equal 0, then
the polynomial is called the 0-polynomial. Otherwise, there exists a nonnegative
integer $n$ satisfying the condition that $a_n \ne 0$ and $a_i = 0$ for all $i > n$. The coefficient
$a_n$ is called the leading coefficient of the polynomial; the integer $n$ is called the
degree of the polynomial, and is denoted by $\deg(f)$. If the leading coefficient of a
polynomial is 1, the polynomial is monic. The degree of the 0-polynomial is defined
to be $-\infty$, where we assume that $-\infty < i$ for each integer $i$ and $(-\infty) + i = -\infty$
for all integers $i$. If $f(X)$ is a polynomial of degree $n \ne -\infty$, we often write it
as $\sum_{i=0}^{n} a_i X^i$. The set of all polynomials in $X$ with coefficients in $F$ is denoted
by $F[X]$. We identify the elements of $F$ with the polynomials of degree at most 0:
the 0-polynomial corresponds to the identity element 0 of $F$ for addition and the
polynomials of degree 0 correspond to the nonzero elements of $F$. In this way we can,
without any problems, consider $F$ as a subset of $F[X]$.


Example The polynomials $5X^3 + 2X^2 + 1$ and $5X^3 - X^2 + X + 4$ in $\mathbb{Q}[X]$ both have
degree 3 and leading coefficient 5. Therefore, they are not monic. The polynomials
$X^3 + 2X^2 + 1$ and $X^3 - X^2 + X + 4$ in $\mathbb{Q}[X]$ are both monic and have the same
degree 3.


We define addition and multiplication of polynomials over a field as follows:
if $f(X) = \sum_{i=0}^{\infty} a_i X^i$ and $g(X) = \sum_{i=0}^{\infty} b_i X^i$ are polynomials in $F[X]$, then
$f(X) + g(X)$ is the polynomial $\sum_{i=0}^{\infty} c_i X^i$, where $c_i = a_i + b_i$ for all $i \ge 0$ and
$f(X)g(X)$ is the polynomial $\sum_{i=0}^{\infty} d_i X^i$, where $d_i = \sum_{j=0}^{i} a_j b_{i-j}$ for all $i \ge 0$. It
is easy to verify that these definitions turn $F[X]$ into an associative and commutative
unital F-algebra with the 0-polynomial acting as the identity element for addition
and the degree-0 polynomial 1 acting as the identity element for multiplication. This
algebra is an integral domain, that is, the product of two nonzero elements of $F[X]$
is again nonzero. In general, an algebra having this property is said to be entire.
Thus, commutative, associative, entire, unital F-algebras are integral domains. The
converse of this is not true: $\mathbb{Z}$ is an integral domain which is not an F-algebra for
any field $F$. Not every commutative and associative unital $\mathbb{R}$-algebra is entire. Indeed,
the functions $f : a \mapsto \max\{a, 0\}$ and $g : a \mapsto \max\{-a, 0\}$ are both nonzero
elements of the $\mathbb{R}$-algebra $\mathbb{R}^{[-1,1]}$, but their product is the 0-function.
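
The coefficient formulas for the sum and the product translate directly into code. The sketch below is a minimal illustration, assuming a polynomial is stored as a nonempty list of coefficients [a0, a1, ..., an] and that the degree of the 0-polynomial is represented by float("-inf"); the function names are hypothetical.

```python
def poly_add(f, g):
    """Coefficients c_i = a_i + b_i of f + g."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    """Coefficients d_i = sum_{j=0}^{i} a_j * b_{i-j} of the product f * g."""
    d = [0] * (len(f) + len(g) - 1)
    for i in range(len(d)):
        for j in range(i + 1):
            if j < len(f) and i - j < len(g):
                d[i] += f[j] * g[i - j]
    return d

def degree(f):
    """Index of the last nonzero coefficient, or -infinity for the 0-polynomial."""
    for i in range(len(f) - 1, -1, -1):
        if f[i] != 0:
            return i
    return float("-inf")

f = [1, 0, 2, 5]     # 5X^3 + 2X^2 + 1
g = [-1, 1]          # X - 1
print(poly_add(f, g), poly_mul(f, g), degree(poly_mul(f, g)))
```

With the convention that $(-\infty) + i = -\infty$, the rule $\deg(fg) = \deg(f) + \deg(g)$ also covers products with the 0-polynomial, which is one way of seeing that the product of two nonzero polynomials is nonzero.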

If $f(X) = \sum_{i=0}^{\infty} a_i X^i$ and $g(X) = \sum_{i=0}^{\infty} b_i X^i$ are polynomials in $F[X]$ then we
define the polynomial $f(g(X))$ to be $\sum_{i=0}^{\infty} a_i g(X)^i$. Then, for any fixed $g(X)$, the
set $F[g(X)] = \{f(g(X)) \mid f(X) \in F[X]\}$ is a unital subalgebra of $F[X]$.

Note that every polynomial in $F[X]$ is a linear combination of elements of the
set $B = \{1, X, X^2, \ldots\}$ over $F$, so $B$ is a set of generators of $F[X]$ over $F$. On the
other hand, it is clear that no finite set of polynomials can be a generating set for
$F[X]$ over $F$, and so $F[X]$ is not finitely generated as a vector space over $F$.

We should remark that the formal definition of multiplication of polynomials 
does not translate into the fastest method of carrying out such multiplication in prac- 
tice on a computer, especially for polynomials of large degree. The problem of fast 
polynomial multiplication has been the subject of extensive research over the years, 
and many interesting algorithms to perform such multiplication have been devised. 
A typical such algorithm is Karatsuba’s algorithm, which is easy to implement on 
a computer: let $f(X)$ and $g(X)$ be polynomials in $F[X]$, where $F$ is a field. We can
write these polynomials as $f(X) = \sum_{i=0}^{n} a_i X^i$ and $g(X) = \sum_{i=0}^{n} b_i X^i$, where $n$ is
a nonnegative power of 2 satisfying $n \ge \max\{\deg(f), \deg(g)\}$. (Of course, in this
case $a_n$ and $b_n$ may equal 0.) We now calculate $f(X)g(X)$ as follows:


(1) If $n = 1$ then $f(X)g(X) = a_1 b_1 X^2 + (a_0 b_1 + a_1 b_0)X + a_0 b_0$.
(2) Otherwise, write $f(X) = f_1(X)X^{n/2} + f_0(X)$ and
    $g(X) = g_1(X)X^{n/2} + g_0(X)$,
    where the polynomials $f_0(X)$, $f_1(X)$, $g_0(X)$, and $g_1(X)$ are all of degree at
    most $n/2$.
(3) Recursively, calculate $f_0(X)g_0(X)$, $f_1(X)g_1(X)$, and
    $(f_0 + f_1)(X)(g_0 + g_1)(X)$.
(4) Then

$$f(X)g(X) = X^n (f_1 g_1)(X) + X^{n/2}\bigl[(f_0 + f_1)(g_0 + g_1) - f_0 g_0 - f_1 g_1\bigr](X) + (f_0 g_0)(X).$$


Indeed, while the multiplication of two polynomials of degree at most $n$ using the
definition of polynomial multiplication takes an order of $2n^2$ arithmetic operations (i.e.,
additions and multiplications), it is possible to prove that there exists a fixed positive
integer $c$ such that the multiplication of two polynomials of degree at most $n$ using
Karatsuba's algorithm takes at most $cn^{\log_2 3}$ (roughly $cn^{1.59}$) arithmetic operations. If $n$ is
sufficiently large, the difference between these two bounds can be significant.

The main idea of Karatsuba's algorithm lies in the recursive reduction of the degrees
of the polynomials involved. The method of recursive reduction has since been
extended to fast algorithms in many other areas of mathematics. We will encounter
it again when we consider the Strassen-Winograd algorithms for matrix multiplication.
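
The four steps of Karatsuba's algorithm can be sketched as follows. This is an illustrative implementation under assumptions, not a tuned one: polynomials are coefficient lists [a0, ..., an] of length n + 1 with n a power of 2, integer coefficients are used for the demonstration, and the names karatsuba and multiply are hypothetical.

```python
def karatsuba(f, g, n):
    """Product of two coefficient lists of length n + 1, n a power of 2.

    Returns a list of length 2n + 1.
    """
    if n == 1:                                   # step (1)
        return [f[0] * g[0], f[0] * g[1] + f[1] * g[0], f[1] * g[1]]
    h = n // 2
    # Step (2): f = f1 * X^h + f0; pad f0 so both halves have length h + 1.
    f0, f1 = f[:h] + [0], f[h:]
    g0, g1 = g[:h] + [0], g[h:]
    # Step (3): three recursive products instead of four.
    low = karatsuba(f0, g0, h)
    high = karatsuba(f1, g1, h)
    mixed = karatsuba([a + b for a, b in zip(f0, f1)],
                      [a + b for a, b in zip(g0, g1)], h)
    middle = [m - lo - hi for m, lo, hi in zip(mixed, low, high)]
    # Step (4): recombine X^n * high + X^h * middle + low.
    out = [0] * (2 * n + 1)
    for i in range(n + 1):
        out[i] += low[i]
        out[i + h] += middle[i]
        out[i + n] += high[i]
    return out

def multiply(f, g):
    """Pad f and g to a common power-of-2 bound n and strip trailing zeros."""
    n = 1
    while n < max(len(f), len(g)) - 1:
        n *= 2
    f = f + [0] * (n + 1 - len(f))
    g = g + [0] * (n + 1 - len(g))
    product = karatsuba(f, g, n)
    while len(product) > 1 and product[-1] == 0:
        product.pop()
    return product

print(multiply([1, 0, 2, 5], [-1, 1]))   # (5X^3 + 2X^2 + 1)(X - 1)
```

The saving comes from step (3): each level of the recursion performs three half-size multiplications rather than four, at the cost of a few extra additions.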


With kind permission of Ekatherina Karatsuba. 


Anatoli Alexeevich Karatsuba is a contemporary Russian mathe- 
matician whose research is primarily in number theory. 


There are other highly-sophisticated algorithms for multiplying two polynomials
of degree at most $n$ in an order of $n \log(n)$ arithmetic operations.


Proposition 4.2 (Division Algorithm) If $F$ is a field and if $f(X)$ and $g(X) \ne
0$ are elements of $F[X]$, then there exist unique polynomials $u(X)$ and $v(X)$
in $F[X]$ satisfying $f(X) = g(X)u(X) + v(X)$ and $\deg(v) < \deg(g)$.


Proof Assume that $f(X) = \sum_{i=0}^{n} a_i X^i$ and $g(X) = \sum_{i=0}^{k} b_i X^i$ are the given polynomials.
If $f(X) = 0$ or if $\deg(f) < \deg(g)$, choose $u(X) = 0$ and $v(X) = f(X)$,
and we are done. Thus we can assume that $n = \deg(f) \ge \deg(g) = k$, and will
prove our result by induction on $n$. If $n = 0$ then $k = 0$, and therefore we can
choose $u(X)$ to be $a_0 b_0^{-1}$, which is a polynomial of degree 0, and choose $v(X)$
to be the 0-polynomial. Now assume, inductively, that $n > 0$ and that the proposition
has been established for all polynomials $f(X)$ of degree less than $n$. Set $h(X) =
f(X) - a_n b_k^{-1} X^{n-k} g(X)$. If this is the 0-polynomial, choose $u(X) = a_n b_k^{-1} X^{n-k}$
and let $v(X)$ be the 0-polynomial. Otherwise, since $\deg(f) > \deg(h)$, we see by the
induction hypothesis that there exist polynomials $v(X)$ and $w(X)$ in $F[X]$ satisfying
$h(X) = g(X)w(X) + v(X)$, where $\deg(g) > \deg(v)$. Thus $f(X) = [a_n b_k^{-1} X^{n-k} +
w(X)]g(X) + v(X)$, as required.

We are left to show uniqueness. Indeed, assume that $f(X)$ equals $g(X)u_1(X) +
v_1(X)$ and $g(X)u_2(X) + v_2(X)$, where $\deg(v_1) < \deg(g)$ and $\deg(v_2) < \deg(g)$.
Then $g(X)[u_1(X) - u_2(X)] + [v_1(X) - v_2(X)]$ equals the 0-polynomial. If we have
$u_1(X) = u_2(X)$ then $v_1(X) = v_2(X)$,
and we are done. Therefore, assume that $u_1(X) \ne u_2(X)$. But then, since $\deg(g[u_1 -
u_2]) > \deg(v_1 - v_2)$ and since $F[X]$ is entire, this is a contradiction. Thus we have
established uniqueness.
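
The existence part of the proof is effectively an algorithm: repeatedly subtract $a_n b_k^{-1} X^{n-k} g(X)$ until the degree drops below $\deg(g)$. The sketch below carries this out iteratively rather than by induction, assuming $F = \mathbb{Q}$ (modelled with Python's fractions.Fraction) and coefficient lists [a0, a1, ...]; the function names are hypothetical.

```python
from fractions import Fraction

def degree(f):
    """Index of the last nonzero coefficient, or -infinity for the 0-polynomial."""
    for i in range(len(f) - 1, -1, -1):
        if f[i] != 0:
            return i
    return float("-inf")

def poly_divmod(f, g):
    """Return (u, v) with f = g*u + v and deg(v) < deg(g), over the rationals."""
    f = [Fraction(a) for a in f]
    g = [Fraction(b) for b in g]
    k = degree(g)
    if k < 0:
        raise ZeroDivisionError("g must not be the 0-polynomial")
    u = [Fraction(0)] * max(len(f) - k, 1)
    v = f[:]
    while degree(v) >= k:
        n = degree(v)
        c = v[n] / g[k]                 # the scalar a_n * b_k^{-1} of the proof
        u[n - k] += c
        for i in range(k + 1):          # v <- v - c * X^(n-k) * g
            v[i + n - k] -= c * g[i]
    return u, v

# Divide X^3 - X^2 + X + 4 by X + 1.
u, v = poly_divmod([4, 1, -1, 1], [1, 1])
print(u, v)
```

In this example the quotient is $X^2 - 2X + 3$ and the remainder is the constant 1; the remainder list is returned with trailing zero entries rather than trimmed.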


Let us emphasize that the set $F[X]$ is composed of formal expressions and not
functions. Every polynomial $f(X) = \sum_{i=0}^{\infty} a_i X^i \in F[X]$ defines a corresponding
polynomial function in $F^F$ given by $c \mapsto f(c) = \sum_{i=0}^{\infty} a_i c^i$, but the correspondence
between polynomials and polynomial functions is not bijective. Indeed, it is possible
for two distinct polynomials to define the same polynomial function. Thus, for
example, if $F = GF(2)$ then the distinct polynomials $X, X^2, X^3, \ldots$ all define the
same function from $F$ to itself, namely the function given by $0 \mapsto 0$ and $1 \mapsto 1$. The
degree of a polynomial function is the least of the degrees of the (perhaps many)
polynomials which define that function.
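
The distinction between a polynomial and the function it defines can be checked directly over a small field. The sketch below tabulates the values of a polynomial on $GF(p)$ for a prime $p$, using integers modulo $p$; the name `poly_function` and the coefficient-list convention (lowest degree first) are assumptions made for this illustration.

```python
def poly_function(coeffs, p):
    """Tuple of values (f(0), f(1), ..., f(p-1)) of f on GF(p), p prime.

    coeffs lists the coefficients of f, lowest degree first.
    """
    return tuple(
        sum(a * pow(c, i, p) for i, a in enumerate(coeffs)) % p
        for c in range(p)
    )


X1 = [0, 1]         # the polynomial X
X2 = [0, 0, 1]      # the polynomial X^2
X3 = [0, 0, 0, 1]   # the polynomial X^3

# Three distinct coefficient lists, one and the same function on GF(2).
print(poly_function(X1, 2), poly_function(X2, 2), poly_function(X3, 2))
```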




The first person to systematically consider the best methods of calculating
$f(c)$ for a polynomial $f(X) \in F[X]$ and for $c \in F$ was the
twentieth-century Russian mathematician Alexander Markovich Ostrowski.


Let $p_1(X)$ and $p_2(X)$ be polynomials in $F[X]$ and let $c \in F$. If we set
$f(X) = p_1(X) + p_2(X)$ and $g(X) = p_1(X)p_2(X)$ then it is clear that
$f(c) = p_1(c) + p_2(c)$ and $g(c) = p_1(c)p_2(c)$.


Proposition 4.3 Let $F$ be a field and let $p(X)$ be a polynomial in $F[X]$.
Then an element $c$ of $F$ satisfies the condition that $p(c) = 0$ if and only if
there exists a polynomial $u(X) \in F[X]$ satisfying $p(X) = (X - c)u(X)$.


Proof By Proposition 4.2, we know that there exist polynomials $u(X)$ and $v(X)$ in
$F[X]$ satisfying $p(X) = (X - c)u(X) + v(X)$, where $\deg(v) < \deg(X - c) = 1$.
Therefore, $v(X) = b$ for some $b \in F$. If $b = 0$ then $p(c) = (c - c)u(c) = 0$.
Conversely, if $p(c) = 0$ then $0 = p(c) = (c - c)u(c) + b = b$ and so
$p(X) = (X - c)u(X)$.


As an immediate consequence of this result, we see that if $F$ is a field and if
$p(X) \in F[X]$ is a nonzero polynomial, then the set of all elements $c$ of $F$ satisfying
$p(c) = 0$ is finite and, indeed, its size cannot exceed the degree of $p(X)$.

Let $F$ be a field. A polynomial $p(X) \in F[X]$ is reducible if and only if there
exist polynomials $u(X)$ and $v(X)$ in $F[X]$, each of degree at least 1, satisfying
$p(X) = u(X)v(X)$. Otherwise, the polynomial is irreducible. Many tests for the
irreducibility of polynomials in $\mathbb{Q}[X]$ have been devised. One of the earliest and
best known is Eisenstein's criterion: if $p(X) = \sum_{i=0}^{n} a_i X^i \in \mathbb{Q}[X]$, where each $a_i$
is an integer, and if there exists a prime integer $q$ such that $q$ does not divide $a_n$,
$q$ divides $a_i$ for all $0 \leq i \leq n-1$, and $q^2$ does not divide $a_0$, then $p(X)$ is
irreducible. (A proof of this can be found in books on abstract algebra.) Thus, using this
criterion, we see that 3X? + 7X? + 49X —7 is an irreducible polynomial in Q[X]. 
Gauss' brilliant student, Ferdinand Eisenstein, died of tuberculosis
at the age of 29.
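
The hypotheses of Eisenstein's criterion are purely arithmetic, so they are easy to test mechanically. The following Python sketch lists the primes $q$ for which the criterion applies to a given integer polynomial. The function name `eisenstein_primes` and the coefficient-list convention (lowest degree first, degree at least 1) are assumptions of this illustration. An empty result only means that the criterion is inconclusive, since the criterion is sufficient but not necessary.

```python
def eisenstein_primes(coeffs):
    """Primes q satisfying Eisenstein's conditions for an integer polynomial.

    coeffs = [a_0, a_1, ..., a_n], lowest degree first, with n >= 1.
    """
    a0, an = coeffs[0], coeffs[-1]
    lower = coeffs[:-1]              # a_0, ..., a_{n-1}

    # Any admissible q divides a_0, so only prime divisors of |a_0| matter.
    candidates = []
    m, q = abs(a0), 2
    while q * q <= m:
        if m % q == 0:
            candidates.append(q)
            while m % q == 0:
                m //= q
        q += 1
    if m > 1:
        candidates.append(m)

    return [
        q for q in candidates
        if an % q != 0                          # q does not divide a_n
        and all(a % q == 0 for a in lower)      # q divides a_0, ..., a_{n-1}
        and a0 % (q * q) != 0                   # q^2 does not divide a_0
    ]


print(eisenstein_primes([-2, 0, 1]))          # X^2 - 2
print(eisenstein_primes([5, 10, 10, 5, 1]))   # X^4 + 5X^3 + 10X^2 + 10X + 5
print(eisenstein_primes([1, 0, 1]))           # X^2 + 1: criterion is silent
```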


Example If $F = GF(5)$ then the polynomial $X^3 + X + 1 \in F[X]$ is irreducible, a fact
which can be established, if necessary, by testing all possibilities. However, when
$F = GF(3)$ it is easy to verify the factorization $X^3 + X + 1 = (X + 2)(X^2 + X + 2)$,
and thus see that the polynomial is reducible.
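
For a polynomial of degree 2 or 3, "testing all possibilities" reduces to looking for roots, because any factorization into polynomials of positive degree must then contain a factor of degree 1. The sketch below performs that search over $GF(p)$ for a prime $p$; the name `roots_mod_p` and the coefficient-list convention (lowest degree first) are assumptions made here.

```python
def roots_mod_p(coeffs, p):
    """All c in {0, ..., p-1} with f(c) = 0 in GF(p), p prime.

    coeffs lists the coefficients of f, lowest degree first.
    """
    return [
        c for c in range(p)
        if sum(a * pow(c, i, p) for i, a in enumerate(coeffs)) % p == 0
    ]


cubic = [1, 1, 0, 1]            # X^3 + X + 1

print(roots_mod_p(cubic, 5))    # no roots, so the cubic is irreducible over GF(5)
print(roots_mod_p(cubic, 3))    # the root 1 corresponds to the factor X + 2 = X - 1
```

For degree 4 and above the absence of roots no longer suffices, since a product of two irreducible quadratics has no roots either.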


Example If $p(X) = u(X)v(X)$ in $F[X]$, then surely $p(X + c) = u(X + c)v(X + c)$
for any $c \in F$, and so to prove that a polynomial $p(X)$ is irreducible it suffices to
prove that $p(X + c)$ is irreducible for some $c \in F$. For example, let $q$ be a prime
integer. The $q$th cyclotomic polynomial in $\mathbb{Q}[X]$ is defined to be $\Phi_q(X) = \sum_{i=0}^{q-1} X^i$.
We claim that this polynomial is irreducible. To see that this is so, we observe that
$\Phi_q(X + 1) = X^{q-1} + \sum_{i=1}^{q-1} \binom{q}{i} X^{i-1}$, which is irreducible by Eisenstein's
criterion.


It is known that the number of monic irreducible polynomials of positive degree
$m$ in $GF(p)[X]$ equals $N(p) = \frac{1}{m} \sum_{d \mid m} \mu(d) p^{m/d}$, where the sum ranges over all
integers $d$ which divide $m$ and the Möbius function $\mu(d)$ is defined by

\[
\mu(d) =
\begin{cases}
1 & \text{if } d = 1, \\
(-1)^k & \text{if } d \text{ is the product of } k \text{ distinct primes}, \\
0 & \text{otherwise.}
\end{cases}
\]

This means that the probability of a randomly-selected monic polynomial of degree
$m$ in $GF(p)[X]$ being irreducible is $N(p)/p^m$, which is roughly $\frac{1}{m}$. In particular,
we note that for every positive integer $m$ there exists at least one monic irreducible
polynomial of degree $m$ in $GF(p)[X]$.
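
The counting formula is easy to evaluate numerically. The Python sketch below computes the Möbius function by trial division and then the number of monic irreducible polynomials of degree $m$ over $GF(p)$. The names `mobius` and `count_monic_irreducibles` are chosen here for illustration only.

```python
def mobius(d):
    """Moebius function of a positive integer d, by trial division."""
    k, q = 0, 2
    while q * q <= d:
        if d % q == 0:
            d //= q
            if d % q == 0:       # q^2 divided the original d
                return 0
            k += 1
        q += 1
    if d > 1:                    # one remaining prime factor
        k += 1
    return (-1) ** k


def count_monic_irreducibles(p, m):
    """(1/m) * sum over divisors d of m of mobius(d) * p^(m/d)."""
    total = sum(
        mobius(d) * p ** (m // d)
        for d in range(1, m + 1)
        if m % d == 0
    )
    return total // m


print(count_monic_irreducibles(2, 3))    # X^3 + X + 1 and X^3 + X^2 + 1
print(count_monic_irreducibles(2, 10) / 2 ** 10)   # close to 1/10
```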

Any polynomial of positive degree in $F[X]$ can be written as a product of irreducible
polynomials. How to find such a decomposition, especially in the case of polynomials
over a finite field or over $\mathbb{Q}$, is a very difficult and important problem,
which attracted such great mathematicians as Newton and which continues
to attract many important mathematicians to this day. Indeed, the problem
of factoring polynomials over finite fields into irreducible components has
become even more important, since it is the basis for many current cryptographic
schemes. There are algorithms, such as Berlekamp's algorithm, which
factor a polynomial $f(X) \in F[X]$, where $F = GF(p^n)$, in a time polynomial
in $p$, $n$, and $\deg(f)$. Moreover, under various assumptions, such as the Generalized
Riemann Hypothesis, polynomials of special forms can be factored much more
rapidly.
A polynomial $p(X) \in F[X]$ of positive degree is completely reducible if and
only if it can be written as a product of polynomials in $F[X]$ of degree 1. Not every
polynomial over every field is completely reducible. For example, the polynomial
$X^2 + 1 \in \mathbb{Q}[X]$ is not completely reducible. The field $F$ is algebraically closed if
every polynomial of positive degree in $F[X]$ is completely reducible. The fields $\mathbb{Q}$
and $\mathbb{R}$ are not algebraically closed. The field $\mathbb{C}$ is algebraically closed, by a theorem
known as the Fundamental Theorem of Algebra. This theorem is in fact analytic and
not algebraic, and relies on various analytic properties of functions of a complex
variable. Most of the great mathematicians of the eighteenth century (d'Alembert,
Euler, Laplace, Lagrange, Argand, Cauchy, and others) tried in vain to prove this
theorem. The first proof was given by Gauss in his doctoral thesis in 1799. His proof
was basically topological and relied on the work of Euler. During his lifetime, Gauss
published several proofs of this theorem.



A “nearly algebraic” proof was given by Ger- 
man/American mathematician Hans Zassenhaus 
in 1969. Most proofs of the Fundamental Theorem 
of Algebra are existence proofs and do not give a 
constructive method of finding the degree-one fac- 
tors of a polynomial over an algebraically-closed 
field. The first constructive proof was given by the German mathematician Helmut Kneser 
in 1940. 


Example The field $F = GF(2)$ is not algebraically closed since the polynomial
$X^2 + X + 1 \in F[X]$ is not completely reducible.


Note that if a field $F$ is algebraically closed then every polynomial function
$F \to F$ defined by a polynomial of positive degree is epic. Indeed, let $p(X) \in F[X]$
be a polynomial of positive degree and let $d \in F$. Then $q(X) = p(X) - d$ is a polynomial
of positive degree in $F[X]$ and so there exists an element $c$ of $F$ such that
$q(c) = 0$. In other words, $p(c) = d$.

It is easy to see that a polynomial $p(X) \in \mathbb{R}[X]$ of degree 1 is irreducible. If
$p(X) = aX^2 + bX + c$ is of degree 2 then, considering it as an element of $\mathbb{C}[X]$, we
have

\[
p(X) = a\left(X + \frac{b}{2a} + \frac{\sqrt{b^2 - 4ac}}{2a}\right)\left(X + \frac{b}{2a} - \frac{\sqrt{b^2 - 4ac}}{2a}\right).
\]

Then this factorization holds in $\mathbb{R}[X]$ as well if and only if $b^2 - 4ac \geq 0$, and so
$p(X)$ is irreducible if and only if $b^2 - 4ac < 0$. From the following result we deduce
immediately that there are no irreducible polynomials in $\mathbb{R}[X]$ of degree greater
than 2.
Proposition 4.4 A monic polynomial $p(X) \in \mathbb{R}[X]$ is irreducible if and only
if it is of the form $X - a$ or $(X - a)^2 + b^2$, where $a \in \mathbb{R}$ and $0 \neq b \in \mathbb{R}$.


Proof Clearly, every polynomial of the form $X - a$ is irreducible. Now assume that
$f(X) = (X - a)^2 + b^2 = X^2 - 2aX + a^2 + b^2$. Were this polynomial reducible, we
could find real numbers $c$ and $d$ satisfying

\[
f(X) = (X - c)(X - d) = X^2 - (c + d)X + cd
\]

and so $c + d = 2a$ and $cd = a^2 + b^2$. This implies that $c^2 - 2ac + a^2 + b^2 = 0$ and
hence $c = \frac{1}{2}\left[2a \pm \sqrt{4a^2 - 4(a^2 + b^2)}\right] = a \pm \sqrt{-b^2}$, which contradicts the
assumption that $c \in \mathbb{R}$ since $b$ is assumed to be nonzero. Thus polynomials of both of the
given forms are indeed irreducible.

Conversely, let $p(X) = \sum_{i=0}^{n} c_i X^i$ be a monic irreducible polynomial in $\mathbb{R}[X]$
that is not of the form $X - a$. By the Fundamental Theorem of Algebra, we know
that there exists a complex number $z = a + bi$ satisfying $p(z) = 0$. Since the
coefficients of $p(X)$ are real, this means that $p(\bar{z}) = 0$ as well, since
$0 = \bar{0} = \overline{p(z)} = \sum_{i=0}^{n} c_i \bar{z}^i = p(\bar{z})$. Thus there exists a polynomial
$u(X) \in \mathbb{R}[X]$ satisfying $p(X) = (X - z)(X - \bar{z})u(X)$, where
$(X - z)(X - \bar{z}) = X^2 - (z + \bar{z})X + z\bar{z} = X^2 - 2aX + a^2 + b^2$. Since $p(X)$ was
assumed irreducible, we conclude that $z \notin \mathbb{R}$ (i.e., $b \neq 0$) and that $p(X)$ equals
$X^2 - 2aX + a^2 + b^2$, as desired.


An obvious generalization of the above construction is the following: Let $F$ be a
field and let $(K, \bullet)$ be an associative and commutative unital $F$-algebra. If $X$ is an
element not in $K$, we can define a polynomial with coefficients in $K$ as a formal sum
$f(X) = \sum_{i=0}^{\infty} a_i X^i$, in which the elements $a_i$ belong to $K$ and no more than a finite
number of them differ from $0_K$. The set of all such polynomials will be denoted
by $K[X]$. As above, we define addition and multiplication in $K[X]$ as follows: if
$f(X) = \sum_{i=0}^{\infty} a_i X^i$ and $g(X) = \sum_{i=0}^{\infty} b_i X^i$ belong to $K[X]$, then $f(X) + g(X)$
is the polynomial $\sum_{i=0}^{\infty} c_i X^i$, in which $c_i = a_i + b_i$ for each $0 \leq i < \infty$, and
$f(X)g(X)$ is the polynomial $\sum_{i=0}^{\infty} d_i X^i$, in which $d_i = \sum_{j=0}^{i} a_j \bullet b_{i-j}$ for each
$0 \leq i < \infty$. Again, it is easy to check that $K[X]$ is an $F$-algebra. Moreover, as a
direct consequence of the definition of multiplication, we see that if $K$ is entire then
so is $K[X]$. This generalization allows us to consider algebras of polynomials in several
commuting indeterminates with coefficients in $K$ defined inductively by setting
$K[X_1, \ldots, X_n] = K[X_1, \ldots, X_{n-1}][X_n]$ for each $n > 1$. Elements of this algebra
are of the form

\[
f(X_1, \ldots, X_n) = \sum a_{i_1, \ldots, i_n} X_1^{i_1} \cdots X_n^{i_n},
\]

where the sum ranges over all $n$-tuples $(i_1, \ldots, i_n)$ of nonnegative integers and
at most finitely-many of the coefficients $a_{i_1, \ldots, i_n} \in K$ are nonzero. The degree of
$f(X_1, \ldots, X_n)$ is

\[
\max\{\, i_1 + \cdots + i_n \mid a_{i_1, \ldots, i_n} \neq 0 \,\}.
\]
A polynomial $\sum a_{i_1, \ldots, i_n} X_1^{i_1} \cdots X_n^{i_n}$ in $K[X_1, \ldots, X_n]$ is flat if and only if
$a_{i_1, \ldots, i_n} \neq 0$ only when each $i_j$ is either 0 or 1.


Proposition 4.5 Let $F$ be a field of characteristic other than 2 and let $K$ be an
associative and commutative unital entire $F$-algebra. Let $f(X_1, \ldots, X_n) =
\sum a_{i_1, \ldots, i_n} X_1^{i_1} \cdots X_n^{i_n} \in K[X_1, \ldots, X_n]$ be a flat polynomial of degree $n$. Then
for each $n$-tuple $(c_1, \ldots, c_n)$ of nonzero elements of $K$ there exist $e_1, \ldots, e_n$ in
$K$, each equal to $1_K$ or $-1_K$, such that $f(e_1 c_1, \ldots, e_n c_n) \neq 0$.


Proof We will prove this result by induction on $n$. If $n = 1$, then $f(X_1) = a_1 X_1 + a_0$,
where $a_1 \neq 0$. If $0_K \neq c \in K$ then either $a_0 + a_1 c$ or $a_0 - a_1 c$ is nonzero, for
otherwise we would have $2a_1 c = 0_K$, which is impossible since $K$ is entire and
the characteristic of $F$ is not 2. Hence the case $n = 1$ has been established. Now
assume that $n > 1$ and that the proposition has been established for flat polynomials
in $K[X_1, \ldots, X_{n-1}]$. We can write the polynomial $f(X_1, \ldots, X_n)$ in the form
$g(X_1, \ldots, X_{n-1}) + h(X_1, \ldots, X_{n-1})X_n$, where $h(X_1, \ldots, X_{n-1})$ is a flat polynomial
in $K[X_1, \ldots, X_{n-1}]$ of degree $n - 1$. If $(c_1, \ldots, c_n)$ is an $n$-tuple of nonzero
elements of $K$ then, by the induction hypothesis, we can find $e_1, \ldots, e_{n-1}$ in $K$,
each equal to $1_K$ or $-1_K$, such that $h(e_1 c_1, \ldots, e_{n-1} c_{n-1}) \neq 0$. But then we have
$g(e_1 c_1, \ldots, e_{n-1} c_{n-1}) + h(e_1 c_1, \ldots, e_{n-1} c_{n-1})X_n \in K[X_n]$ and so, by the case
$n = 1$, we can find $e_n$ equal to $1_K$ or $-1_K$ such that $f(e_1 c_1, \ldots, e_n c_n) \neq 0$.
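
For a concrete feeling of what Proposition 4.5 asserts, one can search over all $2^n$ sign patterns. The Python sketch below does this for $K = \mathbb{Q}$, with ordinary integers standing in for rational numbers. The dictionary representation of a flat polynomial (keys are exponent tuples of 0s and 1s) and the names `evaluate_flat` and `nonvanishing_signs` are assumptions of this illustration, and the exhaustive search is a stand-in for, not a transcription of, the inductive argument in the proof.

```python
from itertools import product


def evaluate_flat(coeffs, point):
    """Evaluate a flat polynomial at a point.

    coeffs maps exponent tuples (each entry 0 or 1) to coefficients, so
    {(1, 1): 1, (0, 0): -1} stands for X1*X2 - 1.
    """
    total = 0
    for exponents, a in coeffs.items():
        term = a
        for e, x in zip(exponents, point):
            if e:
                term *= x
        total += term
    return total


def nonvanishing_signs(coeffs, c):
    """First sign pattern (e_1, ..., e_n), entries +1 or -1, with f(e_i c_i) != 0."""
    for signs in product((1, -1), repeat=len(c)):
        if evaluate_flat(coeffs, [e * x for e, x in zip(signs, c)]) != 0:
            return signs
    return None


f = {(1, 1): 1, (0, 0): -1}        # X1*X2 - 1, flat of degree 2
print(evaluate_flat(f, (1, 1)))    # f vanishes at (1, 1) ...
print(nonvanishing_signs(f, (1, 1)))   # ... but not after a sign change
```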


Exercises 


Exercise 114 
Let F be a field and let (K,e) and (L, *) be F-algebras. Define an operation © 


/ 


/ 
on K x L by H © l*| = peel Is (K x L, ©) an F-algebra? 


Exercise 115 
Let F be a field and let (K, e) be a unitary, associative, commutative, and entire 


F-algebra which, as a vector space, is finitely generated over F. Is K necessarily 
a field? 


Exercise 116 

Let $F$ be a field and let $(K, \bullet)$ be an associative $F$-algebra which, as a vector
space, is finitely generated over $F$. Given an element $a \in K$, do there necessarily
exist elements $a_1, a_2 \in K$ satisfying $a_1 \bullet a_2 = a$?


Exercise 117 


Define an operation e on R? by setting | ® i = ee al Show that 


this operation turns R* into an R-algebra. Is this algebra associative? 
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Exercise 118 
Let F be a field and let (K, e) be a unital F-algebra. Define an operation ¢ on 
the vector space L = K x F by setting | © ; a aa iy ad for 


allv, w eK anda,be F. Is L an F-algebra? Is it unital? 


Exercise 119 

Let F be a field. An F-algebra (K, e) is a division algebra if and only if for 
every v € K and for every Ox # w € K there exist unique vectors x, y € K, not 
necessarily equal, satisfying w ex = v and ye w = v. Is the algebra defined in 
the previous exercise a division algebra? 


Exercise 120 
Let K be the subset of M4 x4(R) consisting of all matrices of the form 
a —b -c —-d 


: : - - for a,b,c,d € R and let L be the subset of M4,.4(R) 
d -c b a 
a —b -c -d 
bis : b a d — ; 
consisting of all matrices of the form ad 4 pI: Are K and L uni- 


d c —b a 
tal subalgebras of M4,4(R)? Are they division algebras? 


Exercise 121 

Let F be a field and let (K, e) be an associative unital F'-algebra with multiplica- 

tive identity e. For units v, w € K, show that: 

(1) ve !+uw))=W+uw)jew!; 

(2) (v+w)!ew=v !e(v-!+w!)~! whenever v + w and v-! + w7! are 
also a units; 

(3) vew t+te=ve(v'+w). 


Exercise 122 

Let F be a field and let (K,e) be an associative F-algebra which, as a vector 
space, is finitely generated over F. Suppose that there exists an element y € K 
satisfying the condition that for each v € K there exists an element v’ € K satis- 
fying v’ e y=v. Show that each such element v’ must be unique. 


Exercise 123 

Let F be an infinite field and let (K,*) be an associative unital F-algebra. 
If v, w € K, show that there are infinitely-many elements w’ of K satisfying 
vew=vew ink. 


Exercise 124 
Let $F$ be a field and let $(K, \bullet)$ be an associative unital $F$-algebra. If $A$ and $B$ are
subsets of $K$, we let $A \bullet B$ be the set of all elements of $K$ of the form $a \bullet b$, with
$a \in A$ and $b \in B$ (in particular, $\emptyset \bullet B = A \bullet \emptyset = \emptyset$). We know that the set $V$ of
all subsets of $K$ is a vector space over $GF(2)$. Is $(V, \bullet)$ a $GF(2)$-algebra? If so, is
it associative? Is it unital?


Exercise 125 

Let $F$ be a field and let $(K, \bullet)$ be an associative $F$-algebra. If $V$ and $W$ are
subspaces of $K$, we let $V \bullet W$ be the set of all finite sums of the form $\sum_{i=1}^{n} v_i \bullet w_i$,
with $v_i \in V$ and $w_i \in W$. Is $V \bullet W$ necessarily a subspace of $K$?


Exercise 126 

Let $(K, \bullet)$ be an associative $F$-algebra and let $v \in K$. If there exists an element
$y$ of $K$ satisfying $v \bullet y \bullet v = v$, show that there also exists an element $w$ of $K$
satisfying $v \bullet w \bullet v = v$ and $w \bullet v \bullet w = w$.


Exercise 127 
For $v, w \in \mathbb{R}^3$, simplify the expression $(v + w) \times (v - w)$.


Exercise 128 
For $u, v, w \in \mathbb{R}^3$, simplify the expression $(u + v + w) \times (v + w)$.


Exercise 129 
Let $F$ be a field and let $(K, \bullet)$ be an $F$-algebra satisfying the Jacobi identity.
Show that $K$ is a Lie algebra if and only if $v \bullet v = 0_K$ for all $v \in K$.


Exercise 130 

Let $F$ be a field and let $(K, *)$ be an associative $F$-algebra. For each $0_F \neq c \in F$,
define an operation $\bullet_c$ on $K$ by setting $v \bullet_c w = c(v * w + w * v)$. For which
values of $c$ is $(K, \bullet_c)$ a Jordan algebra over $F$?


Exercise 131 

Let $F$ be a field and let $(K, \bullet)$ be a unitary $F$-algebra. For each $v \in K$, let $S(v)$
be the set of all $a \in F$ satisfying the condition that $v - a1_K$ does not have an
inverse with respect to the operation $\bullet$. If $v \in K$ has a multiplicative inverse $v^{-1}$
with respect to this operation, show that either $S(v) = \emptyset = S(v^{-1})$ or $S(v) \neq \emptyset$
and $S(v^{-1}) = \{a^{-1} \mid a \in S(v)\}$.


Exercise 132 
Let $F$ be a field and let $L$ be the set of all polynomials $f(X) \in F[X]$ satisfying
the condition that $f(-a) = -f(a)$ for all $a \in F$. Is $L$ a subspace of $F[X]$?


Exercise 133 
Let F be a field and let L be the set of all polynomials f(X) € F[X] satisfying 
the condition that deg(f) is even. Is L a subspace of F[X]? 
Exercise 134
Let F be a field and let f(X),g(X) € F[X]. Show that deg(fg) = 
deg(f) + deg(g). 


Exercise 135 
Let $F$ be a field and let $f(X), g(X) \in F[X]$. Show that $\deg(f + g) \leq
\max\{\deg(f), \deg(g)\}$, and give an example in which we do not have equality.


Exercise 136 
Find polynomials u(X), v(X) € Q[X] satisfying 


X443X3 = (X* 4X4 l)u(X) + v(X). 


Exercise 137 
Let F = GF(2). Find polynomials u(X), v(X) € F[X] satisfying 


X° 4X? = (XP +X +t l)u(X) + v(X). 


Exercise 138 
Let F = GF(7). Find a nonzero polynomial p(X) € F[X] such that the polyno- 
mial function defined by p is the 0-function. 


Exercise 139 
Is the polynomial $6X^4 + 3X^3 + 6X^2 + 2X + 5 \in GF(7)[X]$ irreducible?


Exercise 140 
Is the polynomial X 74 X44 1 € Q[X] irreducible? 


Exercise 141 
Find $t \in \mathbb{R}$ such that there exist $a, b \in \mathbb{R}$ satisfying $a + b = 1$ and
$2a^3 - a^2 - 7a + t = 0 = 2b^3 - b^2 - 7b + t$.


Exercise 142 
For a field F, compare the subsets F[X?] and F[X? + 1] of FLX]. 


Exercise 143 

Let F = GF(p), where p is a prime integer, and let g be an arbitrary function 
from F to itself. Show that there exists a polynomial p(X) € F[X] of degree less 
than p satisfying the condition that g(c) = p(c) forall ce F. 


Exercise 144 
Let $c$ be a nonzero element of a field $F$ and let $n > 1$ be an integer. Show that
there exists a polynomial $p(X) \in F[X]$ satisfying $c^n + c^{-n} = p(c + c^{-1})$.




Exercise 145 
Let $F$ be a field. Find the set of all polynomials $0 \neq p(X) \in F[X]$ satisfying
$p(X^2) = p(X)^2$.


Exercise 146 
Let $p_k(X) = \frac{1}{k!} X(X - 1) \cdots (X - k + 1) \in \mathbb{Q}[X]$ for some positive integer $k$.
Show that $p_k(n) \in \mathbb{Z}$ for every nonnegative integer $n$.


Exercise 147 
Let $p_n(X) = nX^{n+1} - (n + 1)X^n + 1 \in \mathbb{Q}[X]$ for any positive integer $n$. Show
that there exists a polynomial $q_n(X) \in \mathbb{Q}[X]$ satisfying $p_n(X) = (X - 1)^2 q_n(X)$.


Exercise 148 

Let F be a field and let W be a nontrivial subspace of the vector space F[X] 
over F. Let p(X) € F[X] be a given monic polynomial and let p(X)W = 
{ p(X) f (X) | f(X) € W}. Show that p(X)W is a subspace of F[X] and find 
a necessary and sufficient condition for it to equal W. 


Exercise 149 
Let p be a prime integer and let n be a positive integer. Does there necessarily 
exist an irreducible monic polynomial in GF(p)[X] of degree n? 


Exercise 150 

Let $p$ be a prime integer and let $n$ be a positive integer. Show that the product of
all irreducible monic polynomials in $GF(p)[X]$ of degree dividing $n$ is equal to
$X^{p^n} - X$.


Exercise 151 
Let n > 1 be an integer. Is the polynomial p(X) = 1+ )7)_, max € Q[X] nec- 
essarily irreducible? 


Exercise 152 
Show that the polynomial $X^4 + 1$ is irreducible in $\mathbb{Q}[X]$ but reducible in
$GF(p)[X]$ for every prime $p$.


Exercise 153 
Show that $X^4 + 2(1 - c)X^2 + (1 + c)^2 \in \mathbb{Q}[X]$ is irreducible for every $c \in \mathbb{Q}$
satisfying $\sqrt{c} \notin \mathbb{Q}$.


Exercise 154 

Let $F$ be a field and let $K = F^{\mathbb{N}}$. Define operations $+$ and $\bullet$ on $K$ by setting
$f + g : i \mapsto f(i) + g(i)$ and $f \bullet g : i \mapsto \sum_{j+k=i} f(j)g(k)$. Show that $K$ is an
associative and commutative unital $F$-algebra. Is it entire?
Exercise 155

Let $k$ be a positive integer and let $a < b$ be real numbers. A function $f \in \mathbb{R}^{[a,b]}$ is
a spline function of degree $k$ if and only if there exist real numbers $a = a_0 < \cdots <
a_n = b$ and polynomials $p_0(X), \ldots, p_{n-1}(X)$ of degree $k$ in $\mathbb{R}[X]$ satisfying
the condition that $f : x \mapsto p_i(x)$ for all $a_i \leq x \leq a_{i+1}$ and all $0 \leq i \leq n - 1$.
Spline functions play an important part in interpolation theory and in numerical
procedures for solving differential equations. Is the set of all spline functions of
fixed degree $k$ a subspace of the vector space $\mathbb{R}^{[a,b]}$?



Spline functions were first defined and studied by the twentieth- 
century Romanian/American mathematician Isaac Jacob Schoen- 
berg. 


Exercise 156 
Let $F$ be a finite field, let $k > 1$ be an integer, and let $V$ be the vector space over
$F$ consisting of all polynomials in $F[X]$ having degree less than $k$. Let $a_1, \ldots, a_n$
be distinct elements of $F$ and let $W$ be the subset of $F^n$ consisting of all vectors
of the form $\begin{bmatrix} p(a_1) \\ \vdots \\ p(a_n) \end{bmatrix}$ for some $p \in V$. Is $W$ a subspace of $F^n$?


Exercise 157 

A trigonometric polynomial in $\mathbb{R}^{\mathbb{R}}$ is a function of the form $t \mapsto a_0 +
\sum_{h=1}^{k} [a_h \cos(ht) + b_h \sin(ht)]$, where $a_0, \ldots, a_k, b_1, \ldots, b_k \in \mathbb{R}$. Show that the
subset of $\mathbb{R}^{\mathbb{R}}$ consisting of all trigonometric polynomials is an entire $\mathbb{R}$-algebra.
In this chapter, we will see how a restricted collection of vectors in a vector space
over a field can dictate the structure of the entire space, and we will deduce far-ranging
conclusions from this. Let $V$ be a vector space over a field $F$. A nonempty
subset $D$ of $V$ is linearly dependent if and only if there exist distinct vectors
$v_1, \ldots, v_n$ in $D$ and scalars $a_1, \ldots, a_n$ in $F$, not all of which are equal to 0,
satisfying $\sum_{i=1}^{n} a_i v_i = 0_V$. A list of elements of $V$ is linearly dependent if it has two
equal members or if its underlying subset is linearly dependent. Clearly, any set of
vectors containing $0_V$ is linearly dependent. A set of vectors which is not
linearly dependent is linearly independent. That is to say, $D$ is linearly independent
if and only if $D = \emptyset$ or $D \neq \emptyset$ and we have $\sum_{i=1}^{n} a_i v_i = 0_V$, with the $a_i$ in $F$ and
the $v_i$ distinct vectors in $D$, when and only when $a_i = 0$ for all $1 \leq i \leq n$. As a consequence of this
definition, we see that an infinite set of vectors is linearly dependent if and only if it 
has a finite linearly-dependent subset, and an infinite set of vectors is linearly inde- 
pendent if and only if each of its finite subsets is linearly independent. It is also clear 
that any set of vectors containing a linearly-dependent subset is linearly dependent 
and that any subset of a linearly-independent set of vectors is linearly independent. 




The notion of linear independence of vectors was introduced by Grassmann;
it was extensively generalized to other mathematical contexts
by the twentieth-century American mathematician Hassler Whitney.


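Example The subset
$\left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 3 \\ 4 \end{bmatrix}, \begin{bmatrix} -4 \\ 7 \\ 11 \end{bmatrix} \right\}$
of $\mathbb{Q}^3$ is linearly dependent since

\[
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
= (-1)\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
+ 3\begin{bmatrix} -1 \\ 3 \\ 4 \end{bmatrix}
+ (-1)\begin{bmatrix} -4 \\ 7 \\ 11 \end{bmatrix}.
\]

Similarly, the subset
$\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$
of $\mathbb{Q}^3$ is linearly independent, since if

\[
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
= a\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
+ b\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
+ c\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},
\]

then $\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} a + b + c \\ b + c \\ c \end{bmatrix}$ and this implies that $a = b = c = 0$.
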
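
For finite sets of vectors in $\mathbb{Q}^n$, linear independence can be decided mechanically by row reduction: the vectors are independent exactly when the matrix having them as rows has rank equal to the number of vectors. The Python sketch below uses exact rational arithmetic. The names `rank` and `is_linearly_independent` are chosen here for illustration, and Gaussian elimination itself is a standard technique rather than something taken from this passage.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of the matrix whose rows are the given vectors, over Q."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0                                    # number of pivots found so far
    for col in range(ncols):
        pivot = next(
            (i for i in range(r, len(rows)) if rows[i][col] != 0), None
        )
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):    # clear the column below the pivot
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    return rank(vectors) == len(vectors)


dependent = [(1, 2, 1), (-1, 3, 4), (-4, 7, 11)]
independent = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
print(is_linearly_independent(dependent), is_linearly_independent(independent))
```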
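Example The subset
$\left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right\}$
of $V = GF(2)^7$ is linearly independent and generates a subspace of $V$ composed of eight vectors:

\[
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix},
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{bmatrix},
\begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 1 \\ 0 \end{bmatrix},
\begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \end{bmatrix},
\begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix},
\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \end{bmatrix},
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \end{bmatrix},
\text{ and }
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}.
\]

Note that in every element of this subspace other than its identity element for addition,
a majority of the entries are nonzero. This property makes this subspace of $V$ important
in algebraic coding theory.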
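
The eight vectors and the "majority of entries nonzero" property can be confirmed by enumerating all linear combinations of the three generators over $GF(2)$. In the Python sketch below, vectors are written as tuples of 0s and 1s and arithmetic is carried out modulo 2; the names `generators` and `span_gf2` are assumptions of this illustration.

```python
from itertools import product

generators = [
    (1, 1, 0, 1, 1, 0, 0),
    (1, 0, 1, 1, 0, 1, 0),
    (0, 1, 1, 1, 0, 0, 1),
]


def span_gf2(vectors):
    """Set of all GF(2)-linear combinations of the given vectors."""
    n = len(vectors[0])
    return {
        tuple(
            sum(c * v[i] for c, v in zip(coeffs, vectors)) % 2
            for i in range(n)
        )
        for coeffs in product((0, 1), repeat=len(vectors))
    }


subspace = span_gf2(generators)
weights = sorted(sum(v) for v in subspace)   # number of nonzero entries of each vector

print(len(subspace))   # eight distinct vectors, so the generators are independent
print(weights)         # every nonzero vector has 4 of its 7 entries nonzero
```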


Example Let $b > 1$ be a real number, let $\{p_1, p_2, \ldots\}$ be the set of prime integers
and, for each $i$, let $u_i = \log_b(p_i)$. We claim that $D = \{u_1, u_2, \ldots\}$ is a linearly-independent
subset of $\mathbb{R}$ when it is considered as a vector space over $\mathbb{Q}$. Indeed,
assume that this is not the case. Then there are a positive integer $n$ and rational
numbers $a_1, \ldots, a_n$, not all equal to 0, satisfying $\sum_{i=1}^{n} a_i u_i = 0$. If we multiply
both sides by the product of the denominators of the $a_i$, we can assume that the $a_i$
are integers. Then

\[
1 = b^0 = b^{\sum_{i=1}^{n} a_i u_i} = \prod_{i=1}^{n} b^{a_i u_i} = \prod_{i=1}^{n} (b^{u_i})^{a_i} = \prod_{i=1}^{n} p_i^{a_i},
\]

and this is a contradiction, since by unique factorization a product of powers of distinct
primes, with integer exponents not all equal to 0, cannot equal 1. Therefore, $D$ must be
linearly independent.


Example Let $F$ be a field and let $\Omega$ be a nonempty set. Let $V_i$ be a vector space
over $F$ for each $i \in \Omega$, and set $V = \prod_{i \in \Omega} V_i$. We have already seen that the identity
for addition in this vector space is the function $g_0 : \Omega \to \bigcup_{i \in \Omega} V_i$ given by
$g_0 : i \mapsto 0_{V_i}$. For each $i \in \Omega$, let $f_i : \Omega \to \bigcup_{i \in \Omega} V_i$ be a function satisfying the
condition that $f_i(i) \neq g_0(i)$ but $f_i(h) = g_0(h)$ for all $h \in \Omega \setminus \{i\}$. We claim that the
subset $\{f_i \mid i \in \Omega\}$ of $V$ is linearly independent. To see this, assume that there exist a
finite subset $\Lambda$ of $\Omega$ and a family of scalars $\{c_h \mid h \in \Lambda\}$ such that $\sum_{h \in \Lambda} c_h f_h = g_0$.
Then for each $k \in \Lambda$ we have $g_0(k) = \left(\sum_{h \in \Lambda} c_h f_h\right)(k) = \sum_{h \in \Lambda} c_h f_h(k) = c_k f_k(k)$
and since, by definition, $f_k(k) \neq g_0(k)$, we must have $c_k = 0$.


Example If $F$ is a field, the subset $\{1, X, X^2, \ldots\}$ of $F[X]$ is surely linearly
independent, since $\sum_{i=0}^{n} a_i X^i = 0$ if and only if each of the coefficients $a_i$ equals 0.


Example Let $V = \mathbb{R}^{\mathbb{R}}$ be the vector space, over $\mathbb{R}$, of all functions from $\mathbb{R}$ to itself.
Let $D$ be the set of all functions of the form $x \mapsto e^{ax}$ for some real number $a$. We
claim that $D$ is linearly independent. Indeed, assume that there are distinct real numbers
$a_1, \ldots, a_n$ and real numbers $c_1, \ldots, c_n$ such that the function $x \mapsto \sum_{i=1}^{n} c_i e^{a_i x}$
equals the 0-function $f_0 : x \mapsto 0$, which is the identity element of $V$ for addition.
We need to show that each of the $c_i$ equals 0, and this we will do by induction on $n$.

If $n = 1$ then we must have $c_1 = 0$ since the function $x \mapsto e^{ax}$ is different from
$f_0$ for each $a \in \mathbb{R}$. Assume therefore that $n > 1$ and that every subset of $D$ having
no more than $n - 1$ elements is linearly independent. For each $1 \leq i \leq n$, set $b_i =
a_i - a_n$. Then multiplication of the given relation by $e^{-a_n x}$ shows that

\[
0 = \sum_{i=1}^{n} c_i e^{b_i x} = \left[\sum_{i=1}^{n-1} c_i e^{b_i x}\right] + c_n
\]

for all real $x$, and if we differentiate both sides of this equation, we see that $f_0$ is the
function $x \mapsto \sum_{i=1}^{n-1} b_i c_i e^{b_i x}$.
By the induction hypothesis and the choice of the scalars $a_i$ as being distinct, it
follows that $b_i c_i = 0 \neq b_i$ for each $1 \leq i \leq n - 1$ and so $c_i = 0$ for all $1 \leq i \leq n - 1$.
This in turn implies that $c_n = 0$ as well.

Similarly, let $G$ be the subset of $V$ consisting of all of the functions of the form
$g_i : x \mapsto x^{i-1} 2^{x-1}$. We claim that this set too is linearly independent. Indeed,
assume otherwise. Then there exists a positive integer $n$ and there exist real numbers
$c_1, \ldots, c_n$, not all equal to 0, such that $\sum_{i=1}^{n} c_i g_i = f_0$. But this implies that
$2^{x-1}\left(\sum_{i=1}^{n} c_i x^{i-1}\right) = 0$ for each real number $x$. Since $2^{x-1} \neq 0$ for each $x \in \mathbb{R}$, we conclude that
$\sum_{i=1}^{n} c_i x^{i-1} = 0$ for all $x$. But the polynomial function $x \mapsto \sum_{i=1}^{n} c_i x^{i-1}$ from
$\mathbb{R}$ to itself has infinitely-many roots if and only if $c_i = 0$ for all $i$, proving linear
independence.
Note that if $\{v, w\}$ is a linearly-dependent set of vectors in an anticommutative
algebra $(K, \bullet)$ over a field of characteristic other than 2, then there exist scalars $a$
and $b$, not both equal to 0, such that $av + bw = 0_K$. Relabeling if necessary, we can
assume that $b \neq 0$. Then $0_K = a(v \bullet v) + b(v \bullet w) = b(v \bullet w)$ and so $v \bullet w = 0_K$.
A simple induction argument shows that if $D$ is a linearly-dependent subset of $K$
then $v_1 \bullet \cdots \bullet v_n = 0_K$ for any finite subset $\{v_1, \ldots, v_n\}$ of $D$.

Note too that Proposition 3.7 can be easily iterated to get the more general result 
that if D is a nonempty subset of a vector space V over a field F and if B is a finite 
linearly-independent subset of F D having k elements, then there exists a subset D’ 
of D also having k elements satisfying the condition that F((D \ D’) U B) = FD. 
Moreover, if D is linearly independent, so is (D \ D’) U B. This result is sometimes 
known as the Steinitz Replacement Property. 


Proposition 5.1 Let $V$ be a vector space over a field $F$. A nonempty subset
$D$ of $V$ is linearly dependent if and only if some element of $D$ is a linear
combination of the others over $F$.


Proof Assume $D$ is linearly dependent. Then there exists a finite subset $\{v_1, \ldots, v_n\}$
of $D$ and scalars $a_1, \ldots, a_n$, not all of which equal 0, satisfying $\sum_{i=1}^{n} a_i v_i = 0_V$.
Say $a_n \neq 0$. Then $v_n = -a_n^{-1} \sum_{i=1}^{n-1} a_i v_i$ and so we see that $v_n$ is a linear
combination of the other elements of $D$ over $F$. Conversely, assume that some
element of $D$ is a linear combination of the others over $F$. That is to say, there is
an element $v_1$ of $D$, elements $v_2, \ldots, v_n$ of $D \setminus \{v_1\}$ and scalars $a_2, \ldots, a_n$ in $F$
satisfying $v_1 = \sum_{i=2}^{n} a_i v_i$. If we set $a_1 = -1$, we see that $\sum_{i=1}^{n} a_i v_i = 0_V$ and so
$D$ is linearly dependent.


Example For every real number $a$, let $f_a$ be the function in $\mathbb{R}^{\mathbb{R}}$ defined by
$f_a : x \mapsto |x - a|$. We claim that the subset $D = \{f_a \mid a \in \mathbb{R}\}$ of $\mathbb{R}^{\mathbb{R}}$ is linearly independent.
Indeed, assume that this is not the case. Then there exists a real number $b$ such that
$f_b$ is a linear combination of other members of $D$. In other words, there exist a
finite subset $E$ of $\mathbb{R} \setminus \{b\}$ and scalars $c_a$ for each $a \in E$ such that $f_b = \sum_{a \in E} c_a f_a$.
But the function on the right-hand side of this equation is differentiable at $b$, while
the function on the left-hand side is not. From this contradiction, we see that $D$ is
linearly independent.


If $A$ is a nonempty set, then a relation $\preceq$ between elements of $A$ is called a partial
order relation if and only if the following conditions are satisfied:
(1) $a \preceq a$ for all $a \in A$;
(2) If $a \preceq b$ and $b \preceq a$ then $a = b$;
(3) If $a \preceq b$ and $b \preceq c$ then $a \preceq c$.
The term "partial" comes from the fact that, given elements $a$ and $b$ of $A$, it may
happen that neither $a \preceq b$ nor $b \preceq a$. A set on which a partial order has been defined
is a partially-ordered set. A partially-ordered set $A$ satisfying the condition that for
all $a, b \in A$ we have either $a \preceq b$ or $b \preceq a$ is called a chain. A nonempty subset
$B$ of a partially-ordered set $A$ is itself partially-ordered relative to the partial order
relation defined on $A$; it is a chain subset if it is a chain relative to the partial order
defined on $A$.

If $A$ is a nonempty set on which we have a partial order relation $\preceq$ defined, then
an element $a_0$ of $A$ is maximal in $A$ if and only if $a_0 \preceq a$ when and only when
$a = a_0$. An element $a_1$ is minimal if and only if $a \preceq a_1$ when and only when $a = a_1$.
Maximal and minimal elements need not exist or, if they exist, need not be unique. 
The Well Ordering Principle, one of the fundamental axioms of number theory, says 
that any nonempty subset of N, ordered with the usual partial order, has a minimal 
element. This principle is equivalent to the principle of mathematical induction. 

Partial order relations are ubiquitous in mathematics, and often play a very impor- 
tant, though not usually highlighted, part in the analysis of mathematical structures. 


Example Let $A$ be a nonempty set and let $P$ be the collection of all subsets of $A$.
Define a relation $\preceq$ between elements of $P$ by setting $B \preceq B'$ if and only if $B \subseteq B'$.
It is easy to verify that this is indeed a partial order relation. Moreover, $P$ has a
unique maximal element, namely $A$, and a unique minimal element, namely $\emptyset$. The
set $P$ is not a chain whenever $A$ has more than one element since, if $a$ and $b$ are
distinct elements of $A$, then $\{a\} \not\subseteq \{b\}$ and $\{b\} \not\subseteq \{a\}$.


Example Let $A = \{1, 2, 3\}$ and let $P$ be the collection of all subsets of $A$ having one
or two elements. Thus $P$ has six elements: $\{1\}$, $\{2\}$, $\{3\}$, $\{1, 2\}$, $\{1, 3\}$, and $\{2, 3\}$.
Again, the relation $\preceq$ between elements of $P$ defined by setting $B \preceq B'$ if and only
if $B \subseteq B'$ is a partial order relation. Moreover, $P$ has three minimal elements: $\{1\}$,
$\{2\}$, and $\{3\}$; it also has three maximal elements: $\{1, 2\}$, $\{1, 3\}$, and $\{2, 3\}$.


In general, if we have a collection of subsets of a given set, the collection is
partially-ordered by setting $B \preceq B'$ if and only if $B \subseteq B'$. Therefore, it makes sense
for us to talk about “a minimal generating set” of a vector space V—namely a 
minimal element in the partially-ordered collection of all generating sets of V— 
and about “a maximal linearly-independent subset” of a vector space V—namely 
a maximal element of the partially-ordered collection of all linearly-independent 
subsets of V. However, we have no a priori guarantee that such minimal or maximal 
elements in fact exist. 


Example Consider the set A of all integers greater than 1, and define a relation ≼
on A by setting k ≼ n if and only if there is a positive integer t satisfying n = tk.
This is a partial order relation on A. Moreover, A has infinitely-many minimal
elements, since each prime integer is a minimal element of A, while it has no maximal
elements, since n ≼ 2n for each n ∈ A.
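
As a quick sanity check of the divisibility example, the sketch below finds the minimal elements of A that do not exceed a chosen bound. The function names and the bound are our own illustrative choices. Since any k with k ≼ n divides n, it is enough to test candidates smaller than n.

```python
def precedes(k, n):
    """k ≼ n exactly when n = t*k for some positive integer t."""
    return n % k == 0

def minimal_elements_up_to(limit):
    """Minimal elements of (A, ≼) that are at most `limit`.

    n is minimal when no k in A other than n satisfies k ≼ n; any such k
    is a proper divisor of n, so only 2 <= k < n needs to be tested.
    """
    return [n for n in range(2, limit + 1)
            if not any(precedes(k, n) for k in range(2, n))]

smallest_minimal = minimal_elements_up_to(30)
```

The list produced is exactly the list of primes up to the bound, in agreement with the example. No finite computation can exhibit the absence of maximal elements, but the relation n ≼ 2n can at least be spot-checked with precedes(n, 2 * n).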


Proposition 5.2 Let V be a vector space over a field F. Then the following
conditions on a subset D of V are equivalent:

(1) D is a minimal set of generators of V; 

(2) D is a maximal linearly-independent subset of V; 

(3) D is a linearly-independent set of generators of V. 


Proof (1) ⇒ (2): Let D be a minimal set of generators of V, and assume that D is
linearly dependent. By Proposition 5.1, there exists an element $v_0 \in D$ which is a
linear combination of elements of the set $E = D \setminus \{v_0\}$ over F. Say $v_0 = \sum_{i=1}^{n} a_i u_i$,
where the $u_i$ belong to E and the $a_i$ are scalars in F. If v is an arbitrary element of
V then, since D is a set of generators of V, there exist elements $v_1, \ldots, v_t$ of
E and scalars $b_0, b_1, \ldots, b_t$ such that $v = \sum_{j=0}^{t} b_j v_j$. But this then implies that
$v = b_0 v_0 + \sum_{j=1}^{t} b_j v_j = \sum_{i=1}^{n} b_0 a_i u_i + \sum_{j=1}^{t} b_j v_j$ and so E is also a set of
generators of V, contradicting the minimality of D. This establishes the claim that D is
linearly independent. If $v \in V \setminus D$, the set $D \cup \{v\}$ is linearly dependent since v is
a linear combination of elements of D. Thus D is a maximal linearly-independent
set.

(2) => (3): Assume that D is a maximal linearly-independent subset of V. 

Consider a vector $v_0$ in $V \setminus D$. By (2), we know that the set $D \cup \{v_0\}$ is linearly
dependent, and so $0_V \in F(D \cup \{v_0\}) \setminus FD$. By Proposition 3.7, this implies
$v_0 \in F(D \cup \{0_V\}) = FD$, which proves that D is a set of generators of V.

(3) ⇒ (1): Assume that D is a linearly-independent set of generators of V
and that E is a proper subset of D which is also a set of generators for V. Let
$v_0 \in D \setminus E$. Then there exist elements $v_1, \ldots, v_n$ of E and scalars $a_1, \ldots, a_n$ such
that $v_0 = \sum_{i=1}^{n} a_i v_i$. But, by Proposition 5.1, this implies that the set D is linearly
dependent, contradicting (3). Therefore, no such E exists and so D is a minimal set
of generators of V.


Proposition 5.3 Let V be a vector space over a field F and let D be a
linearly-independent subset of V. If $v_0 \in V \setminus FD$ then the set $D \cup \{v_0\}$ is
linearly independent.


Proof Assume that this set is linearly dependent. Then there exist elements
$v_1, \ldots, v_n$ of D and scalars $a_0, a_1, \ldots, a_n$, not all equal to 0, such that
$\sum_{i=0}^{n} a_i v_i = 0_V$. The scalar $a_0$ must be different from 0, for otherwise D would be
linearly dependent, which is a contradiction. Therefore, $v_0 = \sum_{i=1}^{n} (-a_0^{-1} a_i) v_i \in FD$,
which contradicts the choice of $v_0$. Thus $D \cup \{v_0\}$ must be linearly independent.


Proposition 5.3 has important implications. For example, let V be a vector space
over a field F which is not finitely generated and let $D = \{v_1, \ldots, v_n\}$ be a
linearly-independent subset of V. Then $FD \neq V$, since V is not finitely generated, and so
there exists a vector $v_{n+1} \in V \setminus FD$ such that $\{v_1, \ldots, v_{n+1}\}$ is linearly independent.
Thus we see that a vector space which is not finitely generated has
linearly-independent finite subsets of arbitrarily-large size.

A generating set for a vector space V over a field F which is also linearly
independent is called a basis of V over F. In Proposition 5.2, we gave some equivalent
conditions for determining if a subset of a vector space is a basis. However, we have
not yet proven that every (or, indeed, any) vector space must have a basis.


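Example Clearly, $\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}$ and $\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$ are bases
of $F^3$ for any field F. If the characteristic of F is other than 2, then
$\left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \right\}$ is a basis of $F^3$, but if the field F has characteristic 2,
then the set is linearly dependent, since $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$.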
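
The dependence on the characteristic can be illustrated numerically. The sketch below is ours and rests on a standard fact not stated in the text above: three vectors form a basis of $F^3$ exactly when the matrix having them as columns has nonzero determinant. It only tests the prime fields Z/pZ, computing the determinant over the integers and then reducing it modulo p.

```python
def det3(m):
    """Determinant of a 3x3 integer matrix, by cofactor expansion along row 0."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# The columns of M are the three vectors (1,1,0), (1,0,1), (0,1,1).
M = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1]]

d = det3(M)

def is_basis_mod(p):
    """For a prime p: the columns of M form a basis of (Z/pZ)^3 iff d is nonzero mod p."""
    return d % p != 0
```

The integer determinant is −2, which vanishes modulo 2 and modulo no other prime, matching the statement of the example.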


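Example Let F be a field and let both k and n be positive integers. For each
$1 \le s \le k$ and each $1 \le t \le n$, let $H_{st}$ be the matrix $[a_{ij}]$ in $M_{k \times n}(F)$ defined by

$$a_{ij} = \begin{cases} 1 & \text{if } (i, j) = (s, t), \\ 0 & \text{otherwise.} \end{cases}$$

Then $\{H_{st} \mid 1 \le s \le k \text{ and } 1 \le t \le n\}$ is a basis of $M_{k \times n}(F)$.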


Example If F is a field, then we have already seen that the subset $\{1, X, X^2, \ldots\}$ of
F[X] is a linearly-independent generating set for F[X] as a vector space over F, and
so is a basis of this space. The same is true for the subset $\{1, X + 1, X^2 + X + 1, \ldots\}$
of F[X]. More generally, if $\{p_0(X), p_1(X), \ldots\}$ is a subset of F[X] satisfying the
condition that $\deg(p_i(X)) = i$ for all $i \ge 0$, then it is a basis of F[X] as a vector
space over F.


Since every element of a vector space V over a field F has a unique representa- 
tion as a linear combination of elements of a basis, if one wants to define a structure 
of an F-algebra on V it suffices to define the product of any pair of basis elements, 
and then extend the definition by distributivity and associativity. This is illustrated 
by the following example, and we will come back to it again in Proposition 5.5. 


Example We have already noted that if F is a field then F[X] is an associative
F-algebra. Let us generalize this construction. Let H be a nonempty set on
which we have defined an associative operation ∗. Thus, for example, H could
be the set of nonnegative integers with the operation of addition or multiplication.
Let V be the vector space over F with basis $\{v_h \mid h \in H\}$ and define an
operation • on V as follows: if $v = \sum_{g \in H} a_g v_g$ and $w = \sum_{h \in H} b_h v_h$ are elements
of V (where at most finitely-many of the $a_g$ and the $b_h$ are nonzero), then set
$v \bullet w = \sum_{g \in H} \sum_{h \in H} a_g b_h v_{g * h}$. This turns V into an associative F-algebra. In the
case $H = \{X^i \mid i \ge 0\}$, we get F[X]. Such constructions are very important in
advanced applications of linear algebra.
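
A minimal sketch of this construction, under assumptions of our own: an element of V is stored as a Python dict mapping each h in its (finite) support to its coefficient, the scalars are rationals, and the operation ∗ on H is passed in as a function. The name multiply is illustrative only.

```python
from collections import defaultdict
from fractions import Fraction

def multiply(v, w, op):
    """Product in the algebra with basis {v_h | h in H}.

    v and w are dicts {h: coefficient} with finite support; op is the
    associative operation * on H. The result is the sum over all g, h of
    a_g * b_h * v_{g*h}, again stored as a dict.
    """
    result = defaultdict(Fraction)          # missing keys start at 0
    for g, a in v.items():
        for h, b in w.items():
            result[op(g, h)] += a * b
    # Drop zero coefficients so that the support stays minimal.
    return {h: c for h, c in result.items() if c != 0}

def add(g, h):
    """H = nonnegative integers under addition, so v_i plays the role of X^i."""
    return g + h

p = {0: Fraction(1), 1: Fraction(1)}        # 1 + X
q = {0: Fraction(1), 1: Fraction(-1)}       # 1 - X
r = {2: Fraction(3)}                        # 3X^2
pq = multiply(p, q, add)                    # (1 + X)(1 - X) = 1 - X^2
```

Passing a different associative operation, for instance multiplication of nonnegative integers, gives a different algebra on the same underlying vector space.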


Note that a vector space may have (and usually does have) many bases
and so the problem arises as to whether there is a preferred basis among all of
these. For vector spaces of the form $F^n$, there are reasons to prefer the basis
$\left\{ \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \ldots, \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \right\}$; for vector spaces of the form $M_{k \times n}(F)$ there are
reasons to prefer the basis $\{H_{st} \mid 1 \le s \le k \text{ and } 1 \le t \le n\}$ defined above; and for vector
spaces of the form F[X] there are reasons to prefer the basis $\{1, X, X^2, \ldots\}$. These
bases are called the canonical bases of their respective spaces. However, in various 
applications—especially those involving large calculations—it is often convenient 
and sometimes extremely important to pick other bases which fit the problem un- 
der consideration. Indeed, in applications many considerations arise in choosing a 
basis D for a given vector space V. For example, we would like the representation of
elements of V as linear combinations of elements of the basis to be stable under
perturbations of the coefficients. That is to say, if $v = \sum_{i=1}^{n} a_i v_i$, where the $v_i$ are
elements of D, and if $a_i'$ is a scalar near $a_i$ for each $1 \le i \le n$, then we would
like $\sum_{i=1}^{n} a_i' v_i$ to be, in some sense, near v. (What “near” means here depends on
notions of distance arising from the particular situation under consideration.) This
is especially important if our data is based on observation or measurement which
is not assumed to be entirely accurate. For instance, we might want to choose the
basis taking into account the fact that the coefficient of $v_2$ is much more dubious
than the coefficients of the other basis elements, or choose it so that all of the
coefficients $a_i$ are of the same numerical order of magnitude for those vectors v in
which we are really interested and for which we will have to do extensive calculation.

It is also important to emphasize another point. When we defined the notation 
for this book, we stressed that when a set is defined by listing its elements, the set 
comes with an implicit order defined by that listing. When we deal with bases, and 
especially finite bases, the order in which the elements of the basis are written often 
plays a critical role, and one should never lose track of this. 


Proposition 5.4 Let V be a vector space over a field F and let D be a 
nonempty subset of V. Then D is a basis of V if and only if every vector in V 
can be written as a linear combination of elements of D over F in precisely 
one way.


Proof First, let us assume that D is a basis of V and that there exists an
element v of V which can be written as a linear combination of elements of D
over F in two different ways. That is to say, there exist a finite subset
$\{v_1, \ldots, v_n\}$ of D and scalars $a_1, \ldots, a_n, b_1, \ldots, b_n$ in F such that
$v = \sum_{i=1}^{n} a_i v_i = \sum_{i=1}^{n} b_i v_i$, where $a_h \neq b_h$ for at least one index h. Then
$0_V = v - v = \left( \sum_{i=1}^{n} a_i v_i \right) - \left( \sum_{i=1}^{n} b_i v_i \right) = \sum_{i=1}^{n} [a_i - b_i] v_i$, where at least one of the
scalars $a_i - b_i$ is nonzero. This contradicts the assumption that D is a basis and
hence linearly independent. Therefore, every vector in V can be written as a linear
combination of elements of D over F in precisely one way.

Conversely, assume that every vector in V can be written as a linear combination
of elements of D over F in precisely one way. That certainly implies that D is a
generating set for V over F. If $\{v_1, \ldots, v_n\}$ is a subset of D and if $a_1, \ldots, a_n$ are
scalars satisfying $\sum_{i=1}^{n} a_i v_i = 0_V$, then we have $\sum_{i=1}^{n} a_i v_i = \sum_{i=1}^{n} 0 v_i$ and so, by
uniqueness of representation, we have $a_i = 0$ for each $1 \le i \le n$. This shows that D
is linearly independent and so a basis.


We can look at Proposition 5.4 from another point of view. Let D be a nonempty
subset of a vector space V over a field F, and define a function $\theta : F^{(D)} \to V$ by
setting $\theta : f \mapsto \sum_{u \in D} f(u) u$. (This sum is well-defined since only finitely-many of
the summands are nonzero.) Then:

(1) The function θ is monic if and only if D is linearly independent;
(2) The function θ is epic if and only if D is a generating set;
(3) The function θ is bijective if and only if D is a basis.


Proposition 5.5 Let D be a basis for a vector space V over a field F. Then
any function f : D × D → V can be extended in a unique manner to a function
V × V → V which defines on V the structure of an F-algebra. Moreover,
all F-algebra structures on V arise in this manner.


Proof Let $D = \{y_i \mid i \in \Omega\}$. Suppose that we are given a function f : D × D → V.
We define an operation • on V as follows: if v, w ∈ V, then, by Proposition 5.4, we
know that we can write $v = \sum_{i \in \Omega} a_i y_i$ and $w = \sum_{j \in \Omega} b_j y_j$ in a unique manner,
where the $a_i$ and $b_j$ are scalars, only a finite number of which are nonzero; then set
$v \bullet w = \sum_{i \in \Omega} \sum_{j \in \Omega} a_i b_j f(y_i, y_j)$. It is straightforward to show that this defines
the structure of an F-algebra on V. Conversely, if (V, •) is an F-algebra, define the
function f : D × D → V by $f : (y_i, y_j) \mapsto y_i \bullet y_j$.


The function f in Proposition 5.5 is the multiplication table of the vector
multiplication operation • with respect to the basis D.


Example Let F be a field and let a, b ∈ F. Let $B = \{v_1, v_2, v_3, v_4\}$ be the canonical
basis for $F^4$ over F. Define an operation • on B according to the multiplication
table:

$$\begin{array}{c|cccc}
\bullet & v_1 & v_2 & v_3 & v_4 \\ \hline
v_1 & v_1 & v_2 & v_3 & v_4 \\
v_2 & v_2 & a v_1 & v_4 & a v_3 \\
v_3 & v_3 & -v_4 & b v_1 & -b v_2 \\
v_4 & v_4 & -a v_3 & b v_2 & -ab v_1
\end{array}$$

and extend this operation to $F^4$ by setting

$$\left( \sum_{i=1}^{4} a_i v_i \right) \bullet \left( \sum_{j=1}^{4} b_j v_j \right) = \sum_{i=1}^{4} \sum_{j=1}^{4} a_i b_j (v_i \bullet v_j).$$


Then $F^4$, together with this operation, is a unital associative algebra known as a
quaternion algebra over F, in which $v_1$ is the identity element for multiplication.
In the special case of F = ℝ and a = b = −1, we get the algebra of real quaternions,
which is denoted by ℍ. The algebra of real quaternions was first defined by Hamilton
in 1844 as a generalization of the field of complex numbers (and earlier studied
by Gauss, who did not publish his results). It is a division algebra over ℝ since every
nonzero quaternion is a unit of ℍ. These were subsequently generalized by Clifford
and used in his study of non-Euclidean spaces. Lately, they have also been used in
computer graphics and in signal analysis. If F is a field having characteristic p > 0,
quaternion algebras over F are not even entire. However, they arise naturally in the
theory of elliptic curves, and so are of great importance in cryptography. If p > 2,
then no quaternion algebras over F are commutative.
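
The multiplication table of this example can be turned into a short computation. In the sketch below, which is ours and not part of the original text, a quaternion is a list of four coefficients with respect to the canonical basis, indices 0 to 3 stand for the basis vectors numbered 1 to 4, and each table entry is stored as a pair (coefficient, index of the resulting basis vector). The names make_table and multiply are illustrative.

```python
from fractions import Fraction
from itertools import product

def make_table(a, b):
    """table[i][j] = (c, k) means: (basis vector i) . (basis vector j) = c * (basis vector k)."""
    return [
        [(1, 0), (1, 1), (1, 2), (1, 3)],
        [(1, 1), (a, 0), (1, 3), (a, 2)],
        [(1, 2), (-1, 3), (b, 0), (-b, 1)],
        [(1, 3), (-a, 2), (b, 1), (-a * b, 0)],
    ]

def multiply(x, y, table):
    """Bilinear extension of the table to coefficient lists x and y of length 4."""
    out = [0, 0, 0, 0]
    for i, j in product(range(4), repeat=2):
        coeff, k = table[i][j]
        out[k] += x[i] * y[j] * coeff
    return out

real_quaternions = make_table(-1, -1)      # the case a = b = -1
x = [1, 2, 3, 4]
x_conj = [1, -2, -3, -4]
norm = multiply(x, x_conj, real_quaternions)   # a multiple of the identity element
```

For the real quaternions, the product of x with its conjugate is a positive multiple of the identity whenever x is nonzero, which is one way of seeing that every nonzero real quaternion is a unit.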


With kind permission of the Special collections, Fine Arts Library, Harvard University
(Tait); With kind permission of the London Mathematical Society (Clifford).

Sir William Rowan Hamilton, a nineteenth-century Irish mathematician and physicist,
helped create matrix theory in its modern formulation, together with Cayley and Sylvester.
Hamilton was the first to use the terms “vector” and “scalar” in an algebraic context. His
championship of quaternions as an alternative to vectors in physics was later taken up by
Scottish mathematician Peter Guthrie Tait. The nineteenth-century British mathematician
William Kingdon Clifford was one of the first to argue that energy and matter were just
different types of curvature of space.


We now show that any vector space over a field F has a basis. Indeed, the
following two propositions show something somewhat stronger than that.


Proposition 5.6 If V is a vector space finitely generated over a field F then
every finite generating set of V over F contains a basis of V.


Proof Let V be a vector space finitely generated over a field F and let D be a finite
generating set for V over F. If D is minimal among all generating sets for V, then
we know by Proposition 5.2 that it is a basis of V. If not, it properly contains other
generating sets for V over F, one of which, say E, has the fewest elements. Then
E cannot properly contain any other generating set for V over F, and so it must be
a basis of V.


Proposition 5.7 If V is a vector space finitely generated over a field F then 
every linearly-independent subset B of V is contained in a basis of V over F. 


Proof By assumption, there exists a finite generating set $\{v_1, \ldots, v_n\}$ for V over F.
Let B be a linearly-independent subset of V. If $v_i \in FB$ for each $1 \le i \le n$, then
FB = V, and B is itself a basis of V. Otherwise, let $h = \min\{i \mid v_i \notin FB\}$. By
Proposition 5.3, the set $D = B \cup \{v_h\}$ is linearly independent. If it is a generating set
for V, then it is a basis and we are done. If not, let $k = \min\{i \mid v_i \notin FD\}$, and replace
D by $B \cup \{v_h, v_k\}$. Continuing in this manner, we see that after finitely-many steps
we obtain a basis of V.
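
The proof of Proposition 5.7 is effectively an algorithm. The following sketch carries it out for subspaces of $\mathbb{Q}^m$, with vectors represented as tuples; the choice of field, the helper names rank and extend_to_basis, and the use of Gaussian elimination to test membership in a span are our own assumptions. Scanning the generators in order is equivalent to repeatedly taking the least index whose generator lies outside the current span, since a generator already in the span stays in it.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over Q of a list of equal-length vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0                                   # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def extend_to_basis(B, generators):
    """Adjoin to the independent list B each generator outside the current span."""
    D = list(B)
    for v in generators:
        if rank(D + [v]) > rank(D):         # v is not in the span of D
            D.append(v)
    return D

B = [(1, 1, 0)]
generators = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
basis = extend_to_basis(B, generators)
```

In this run the second generator is skipped, because it is the difference of the two vectors already chosen.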


With kind permission of the Department of Mathematics, University of Torino, Italy. 


The Italian mathematician Giuseppe Peano, best known for his axiomatization
of the natural numbers, was the first, at the end of the nineteenth
century, to prove that every finitely-generated vector space has a basis.
He also gave the final form for the definition of a vector space,
which we used above.


We now want to extend this result to vector spaces which are not finitely generated,
and to do so we have to make use of an axiom of set theory known variously
as the Hausdorff Maximum Principle or Zorn’s Lemma. To state this principle, we
need another concept about partially-ordered sets. Let A be a set on which we have
defined a partial order ≼. A subset B of A is bounded if and only if there exists an
element $a_0 \in A$ satisfying $b \preceq a_0$ for all b ∈ B. Note that we do not require that $a_0$
belong to B. The Hausdorff maximum principle then says that if A is a
partially-ordered set in which every chain subset is bounded, then A has a maximal element.
Again, this is not really a “principle” or a “lemma”; it is an axiom of set theory which
has been shown to be independent of the other (Zermelo–Fraenkel) axioms one
usually assumes. Indeed, it is logically equivalent to the Axiom of Choice, which we
mentioned in Chap. 1 as being somewhat controversial among those mathematicians
dealing with the foundations of mathematics. However, in this book, we will assume
that it holds. Given that assumption, we can now extend Proposition 5.7.


With kind permission of the Hausdorff Research Institute for 
Mathematics (Hausdorff); © Jens Zorn (Zorn). 

Felix Hausdorff, one of the leading mathemati- 
cians of the early twentieth century and one of the 
founders of topology, died in a German concen- 
tration camp in 1942. Max Zorn, a German math- 
ematician who emigrated to the United States, 
made skillful use of the Hausdorff Maximum 
Principle in his research, turning it into an impor- 
tant mathematical tool. 


Proposition 5.8 If V is a vector space over a field F then every linearly- 
independent subset B of V is contained in a basis of V. 


Proof Let B be a linearly-independent subset of V and let P be the collection of
all linearly-independent subsets of V which contain B, which is partially-ordered
by inclusion, as usual. Then P is nonempty since B ∈ P. Let Q be a chain subset
of P. We want to prove that Q is bounded in P. That is to say, we want to find a
linearly-independent subset E of V which contains every element of Q. Indeed, let
us take E to be the union of all of the elements of Q. To show that E is linearly
independent, it suffices to show that every finite subset of E is linearly independent.
Indeed, let $\{v_1, \ldots, v_n\}$ be a finite subset of E. Then for each $1 \le i \le n$, there exists
an element $D_i$ of Q containing $v_i$. Since Q is a chain, there exists an index h such
that $D_i \subseteq D_h$ for all $1 \le i \le n$ and so $v_i \in D_h$ for all $1 \le i \le n$. Therefore, this set is
a subset of a linearly-independent set and so is linearly independent. Thus we have
shown that every chain subset of P is bounded and so, by the Hausdorff maximum
principle, the set P has a maximal element. In other words, there exists a maximal
linearly-independent subset of V containing B, and this, as we know, is a basis of
V over F.


Taking the special case of B = ∅ in Proposition 5.8, we see that every vector
space has a basis. In the above proof we used the Axiom of Choice to prove this 
statement. In fact, one can show something considerably stronger: in the presence 
of the other generally-accepted axioms of set theory, the Axiom of Choice is equiv- 
alent, in the sense of formal logic, to the statement that every vector space over any 
field has a basis. 




© A. Blass. 


The above result is due to the contemporary American mathematician, 
Andreas Blass. 


Example Consider the field ℝ as a vector space over its subfield ℚ. A basis for
this space is known as a Hamel basis. By Proposition 5.8, we know that Hamel
bases exist, but nobody has been able to come up with a method of specifically
constructing one. The subset C of ℝ consisting of all real numbers which can be
represented in the form $\sum_{i=1}^{\infty} u_i 3^{-i}$, where each $u_i$ is either 0 or 2, is called the
Cantor set, and it can be shown to be “sparse” (in a technical sense of the word we
won’t go into here) in the unit interval [0, 1] in ℝ. It is possible to show that there is
a Hamel basis of ℝ contained in C.

The existence of Hamel bases leads to some very interesting results, as the
following shows. Indeed, let H be a Hamel basis of ℝ. If r ∈ ℝ then we can write
$r = \sum_{a \in H} q_a(r) a$, where $q_a(r) \in \mathbb{Q}$ and there are only finitely-many elements
a ∈ H for which $q_a(r) \neq 0$. Since such a representation is unique, we see that

$$q_a(b) = \begin{cases} 1 & \text{if } a = b, \\ 0 & \text{otherwise} \end{cases}$$

for a, b ∈ H. Moreover, if r, s ∈ ℝ and a ∈ H then $q_a(r + s) = q_a(r) + q_a(s)$. So, if
a ≠ b are elements of H then for any r ∈ ℝ we have $q_a(r + b) = q_a(r) + q_a(b) = q_a(r)$.
Thus we see that the function $q_a \in \mathbb{R}^{\mathbb{R}}$ is periodic, with period b for any
$b \in H \setminus \{a\}$, and its image is contained in ℚ. Moreover, if we pick two distinct
elements c and d of H, we see that for each r ∈ ℝ, we have r = f(r) + g(r), where
$f, g \in \mathbb{R}^{\mathbb{R}}$ are defined by $f : r \mapsto q_c(r) c$ and $g : r \mapsto \sum_{a \in H \setminus \{c\}} q_a(r) a$. By our
previous comments, f is periodic with period d and g is periodic with period c.
We conclude that the identity function in $\mathbb{R}^{\mathbb{R}}$ is the sum of two periodic functions.
A somewhat more sophisticated argument along the same lines shows that any
polynomial function in $\mathbb{R}^{\mathbb{R}}$ of degree n is the sum of n + 1 periodic functions. Of course,
since we cannot specify H, there is no way of finding these periodic functions
explicitly.


© Professor Richard von Mises. 


The twentieth-century German mathematician Georg Hamel was a 
student of Hilbert who worked primarily in function theory. In his later 
years, he became notorious for his pro-Nazi views and activities.


We have seen that a vector space over a field can have many bases. We want
to show next that if the vector space is finitely generated, then all of these bases 
are finite and have the same number of elements. First, however, we must prove a 
preliminary result. 


Proposition 5.9 Let V be a vector space over a field F which is generated by
a finite set $B = \{v_1, \ldots, v_n\}$ and let D be a linearly-independent set of vectors
in V. Then the number of elements in D is at most n.


Proof Suppose that D has a subset $E = \{w_1, \ldots, w_{n+1}\}$ having more than n
elements. Since this set must also be linearly independent, we know that none of the
$w_i$ equals $0_V$. For each $1 \le k \le n$, set $D_k = \{w_1, \ldots, w_k, v_{k+1}, \ldots, v_n\}$.

Since B is a generating set for V, we can find scalars $a_1, \ldots, a_n$, not all equal to 0,
such that $w_1 = \sum_{i=1}^{n} a_i v_i$. In order to simplify our notation, we will renumber the
elements of B if necessary so that $a_1 \neq 0$. Then $v_1 = a_1^{-1} w_1 - \sum_{i=2}^{n} a_1^{-1} a_i v_i$ and so
$B \subseteq FD_1$. But $D_1 \subseteq V = FB$ and so $V = FD_1$ by Proposition 3.6. Now assume
that $1 \le k < n$ and that we have already shown that $V = FD_k$. Then there exist
scalars $b_1, \ldots, b_n$, not all equal to 0, such that $w_{k+1} = \sum_{i=1}^{k} b_i w_i + \sum_{i=k+1}^{n} b_i v_i$.
If the scalars $b_{k+1}, \ldots, b_n$ are all equal to 0, then we have shown that D is linearly
dependent, which is not the case. Therefore, at least one of them is nonzero and, by
renumbering if necessary, we can assume that $b_{k+1} \neq 0$. Thus
$v_{k+1} = b_{k+1}^{-1} w_{k+1} - \sum_{i=1}^{k} b_{k+1}^{-1} b_i w_i - \sum_{i=k+2}^{n} b_{k+1}^{-1} b_i v_i$ and so, using the above reasoning, we get
$V = FD_{k+1}$. Continuing in this manner, we see that after n steps we obtain
$V = FD_n = F\{w_1, \ldots, w_n\}$. But then $w_{n+1} \in F\{w_1, \ldots, w_n\}$ and so E is linearly dependent,
contrary to our assumption. This proves that D can have at most n elements.


Proposition 5.10 Let V be a vector space finitely generated over a field F. 
Then any two bases of V have the same number of elements. 


Proof By hypothesis, there exists a finite generating set for V over F having, say,
n elements. If B is a basis of V then, by Proposition 5.9, we know that B has at
most n elements and so, in particular, is finite. Suppose B and B′ are two bases
for V having h and k elements, respectively. Since B is linearly independent and
B′ is a generating set, we know that h ≤ k. But, on the other hand, B′ is linearly
independent and B is a generating set, so k ≤ h. Thus h = k.


We should remark at this point that the assertion for linearly-dependent sets cor- 
responding to Proposition 5.10 is not true. That is to say, a finite linearly-dependent 
set of vectors may have two minimal linearly-dependent subsets with different num- 
bers of elements. Indeed, there is no efficient algorithm to find such subsets of a 
given linearly-dependent set. Minimal linearly-dependent sets of vectors are often
called circuits because of applications to graph theory. We should also note that
Proposition 5.10 is a special case of a more general theorem: If V is a vector space
(not necessarily finitely generated) over a field F then there exists a bijective func- 
tion between any two bases of V. The proof of this result makes use of techniques 
from advanced set theory, such as transfinite induction. 
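
To make the remark about minimal linearly-dependent subsets concrete, here is a small brute-force sketch over ℚ. The four vectors, the field, and the helper names rank and circuits are our own choices. The search examines every subset and so takes exponential time, in keeping with the remark that no efficient algorithm is available.

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank over Q of a list of equal-length vectors, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def circuits(vectors):
    """All minimal linearly-dependent subsets of a list of distinct vectors."""
    found = []
    for size in range(1, len(vectors) + 1):
        for subset in combinations(vectors, size):
            dependent = rank(list(subset)) < len(subset)
            # Subsets are visited by increasing size, so a dependent subset is
            # minimal exactly when it contains no circuit found earlier.
            if dependent and not any(set(c) <= set(subset) for c in found):
                found.append(subset)
    return found

vectors = [(1, 0), (2, 0), (0, 1), (1, 1)]
sizes = sorted(len(c) for c in circuits(vectors))
```

Here the pair (1, 0), (2, 0) is a circuit with two elements, while (1, 0), (0, 1), (1, 1) is a circuit with three, so the two sizes differ.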

If V is a vector space finitely generated over a field F then V is finite dimen- 
sional and the number of elements in a basis of V is called the dimension of V 
over F. If V is not finite dimensional, it is infinite dimensional. (In choosing this
latter terminology, we are deliberately skipping over the subject of various transfi- 
nite dimensions, since the reader is not assumed to be familiar with the arithmetic 
of transfinite cardinals. In certain mathematical contexts, distinction between infi- 
nite dimensions—for example the distinction between spaces of countably-infinite 
and uncountably-infinite dimension—can be very significant. We will not, however, 
need it in this book.) We denote the dimension of V over F by dim(V), or by
$\dim_F(V)$ when it is important to emphasize the field of scalars.


With kind permission of the Archives of the Mathematisches Forschungsinstitut Ober- 
wolfach. 

The notion of dimension was implicit in the work of Peano, but was 
redefined and studied in a comprehensive manner by the twentieth- 
century German mathematician Hermann Weyl. 


Notice that the proof of Proposition 5.9, which is in turn critical in proving Propo- 
sition 5.10, uses the fact that every nonzero element of F has a multiplicative in- 
verse, and this cannot be avoided. If we try to weaken the notion of a vector space 
by allowing scalars to be, say, only integers, it may happen that such a space would 
have two bases of different sizes and so we could no longer define the notion of 
dimension in an obvious manner. We did not use, in an unavoidable manner, the 
commutativity of scalar multiplication and so we could weaken our notion of a vec- 
tor space to allow scalars which do not commute among themselves, such as scalars 
coming from ℍ. However, the generality thus gained does not seem to outweigh the
bother it causes, and so we will refrain from doing so. Thus, for us, the fact that 
scalars always come from a field is critical in the development of our theory. 


Example If F is a field then $\dim(F^n) = n$ for every positive integer n, since the
canonical basis of $F^n$ has n elements. Similarly, if k and n are positive integers then
$\dim_F(M_{k \times n}(F)) = kn$, since the canonical basis of $M_{k \times n}(F)$ has kn elements.
The dimension of the space F[X] is infinite since the canonical basis of F[X] has
infinitely-many elements.


Example If F is a field and n is a positive integer, then the set W of all polynomials
in F[X] having degree at most n is a subspace of F[X] having dimension n + 1,
since $\{1, X, \ldots, X^n\}$ is a basis of W having n + 1 elements.


Example Let V be a vector space over ℝ. Then $Y = V^2$ is a vector space over ℝ,
but it also has the structure of a vector space over ℂ with the same addition and
with scalar multiplication given by

$$(a + bi) \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} a v_1 - b v_2 \\ b v_1 + a v_2 \end{bmatrix}.$$

This space is called the complexification of V. If B is a basis for V over ℝ then it is
easy to check that $\left\{ \begin{bmatrix} v \\ 0_V \end{bmatrix} \,\middle|\, v \in B \right\}$ is a basis for Y over ℂ. Thus, in particular, if V is
finitely generated over ℝ then Y is finitely generated over ℂ and $\dim_{\mathbb{R}}(V) = \dim_{\mathbb{C}}(Y)$.
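
The scalar multiplication on the complexification can be checked in a small case. In the sketch below we take V to be $\mathbb{R}^2$, represent an element of Y as a pair of tuples, and use Python's built-in complex type for the scalars; these representation choices and the name scalar_mult are ours.

```python
def scalar_mult(z, y):
    """Compute (a + bi) . [v1; v2] = [a v1 - b v2; b v1 + a v2].

    z is a Python complex number and y = (v1, v2) is a pair of equal-length
    tuples of real numbers.
    """
    a, b = z.real, z.imag
    v1, v2 = y
    top = tuple(a * s - b * t for s, t in zip(v1, v2))
    bottom = tuple(b * s + a * t for s, t in zip(v1, v2))
    return (top, bottom)

y = ((1.0, 2.0), (3.0, 4.0))
minus_y = scalar_mult(1j, scalar_mult(1j, y))   # multiplying by i twice negates y
```

Multiplying by i sends the pair (v1, v2) to (−v2, v1), so doing it twice negates both components, as one expects from i² = −1.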


With kind permission of UC Berkeley. 


Complexification of real vector spaces was first used extensively by 
the twentieth-century American mathematician Angus Taylor. 


Example The dimension of ℝ over itself is 1. Since {1, i} is a basis of ℂ as a vector
space over ℝ, we see that $\dim_{\mathbb{R}}(\mathbb{C}) = 2$ and so there cannot be a proper subfield F
of ℂ properly containing ℝ. Indeed, if there were such a field, its dimension over ℝ
would have to be greater than 1 and less than 2 (else it would be equal to ℂ), which
is impossible. Clearly, $\dim_{\mathbb{R}}(\mathbb{H}) = 4$. It turns out that the only possible dimensions
of division ℝ-algebras are 1, 2, 4, and 8. The dimension 8 case is realized by a
(non-associative) Cayley algebra over ℝ, as defined in Chap. 15. There are no associative
division algebras of dimension 8 over ℝ.


With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach.

The twentieth-century German mathematician Heinz Hopf used algebraic topology
to prove that the only possible dimensions of division ℝ-algebras were powers of 2,
and the final result was obtained by the twentieth-century American mathematician
Raoul Bott and contemporary American mathematician John Milnor, again using
non-algebraic tools.


Example Let F be a field, let (K, •) be an associative unital F-algebra, and let
v ∈ K. If $p(X) = \sum_{i=0}^{n} a_i X^i \in F[X]$, then $p(v) = \sum_{i=0}^{n} a_i v^i$ is an element of K
and the set of all elements of K of this form is an F-subalgebra of K, which is in fact
commutative, even though K itself may not be. We will denote this algebra by F[v].
If the dimension of F[v], considered as a vector space over F, is finite, we know that
there must exist a polynomial p(X) ∈ F[X] of positive degree satisfying $p(v) = 0_K$.
In that case, we say that v is algebraic over F. Otherwise, if the dimension of F[v]
is infinite, we say v is transcendental over F. Thus, for example, the real numbers
π and e (the base of the natural logarithms) are transcendental over ℚ. If F is a
subfield of a field K then the set L of all elements of K which are algebraic over F
is a subfield of K. Moreover, if K is algebraically closed, so is L, and in fact L is
the smallest algebraically-closed subfield of K containing F. In particular, we can
consider the field of all complex numbers algebraic over ℚ. This is a proper subfield
of ℂ, known as the field of algebraic numbers.


With kind permission of the Archives of the Mathematis- 
ches Forschungsinstitut Oberwolfach. 

The transcendence of π was proven by German
mathematician Ferdinand von Lindemann in 
1882. The transcendence of e was proven by 
French mathematician Charles Hermite in 1873. 
As we shall see later, Hermite made many impor- 
tant contributions to linear algebra. 


From the definition of dimension we see that if V is a vector space of finite 
dimension n over a field F then: 
(1) Every subset of V having more than n elements must be linearly dependent; 
(2) There exists a linearly-independent subset B of V having precisely n elements; 
(3) If B is as in (2) then B is also a generating set of V over F. 


Proposition 5.11 Let V be a vector space finitely generated over a field F 
and let W be a subspace of V. Then: 

(1) W is finitely generated over F; 

(2) Every basis of W can be extended to a basis of V; 

(3) dim(W) ≤ dim(V), with equality when and only when W = V.


Proof Let n = dim(V).

(1) If W is not finitely generated, then, as we remarked after Proposition 5.3,
W has a linearly-independent subset B having n + 1 elements. But B is also a
subset of V, contradicting the assumption that dim(V) = n.

(2) Let B be a basis of W. Then B is a linearly-independent set of elements of
V and so, by Proposition 5.7, can be extended to a basis of V.

(3) By (2), we see that the number of elements of a basis of W can be no greater
than the number of elements of a basis of V, and so dim(W) ≤ dim(V). Moreover, if
we have equality then any basis B of W is also a basis of V, and so W = FB = V.


We now want to extend the notion of linear independence. Let U and W be
subspaces of a vector space V over a field F. Any vector v ∈ U + W can be written in
the form u + w, where u ∈ U and w ∈ W, but there is no reason for this
representation to be unique. It will be unique, however, if U and W are disjoint. Indeed, if
this condition holds and if u, u′ ∈ U and w, w′ ∈ W satisfy u + w = u′ + w′, then
$u - u' = w' - w \in U \cap W$ and so $u - u' = 0_V = w' - w$, which in turn implies that
u = u′ and w = w′. To emphasize the importance of this situation, we will introduce
new notation: if U and W are disjoint subspaces of a vector space V over a field F,
we will write U ⊕ W instead of U + W. The subspace U ⊕ W is called the direct
sum of U and W. We note that, by this definition, $U \oplus \{0_V\} = U$ for every subspace
U of V.


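Example It is easy to see that $\mathbb{R}^2 = \mathbb{R} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \oplus \mathbb{R} \begin{bmatrix} 0 \\ 1 \end{bmatrix}$.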


Of course, we would like to extend the notion of direct sum to cover more than
two subspaces. In general, if V is a vector space over a field F, then a collection
$\{W_h \mid h \in \Omega\}$ of subspaces of V is independent if and only if it satisfies the following
condition: If Λ is a finite subset of Ω and if we choose elements $w_h \in W_h$ for all
h ∈ Λ, then $\sum_{h \in \Lambda} w_h = 0_V$ when and only when $w_h = 0_V$ for each h ∈ Λ. Thus
we see that an infinite collection of subspaces is independent if and only if every
finite nonempty subcollection is independent. Clearly, a subset D of a vector space
V over a field F is linearly independent if and only if the collection of subspaces
{Fv | v ∈ D} is independent.


Proposition 5.12 Let V be a vector space over a field F and let $W_1, \dots, W_n$ 
be distinct subspaces of V. Then the following conditions are equivalent: 

(1) $\{W_1, \dots, W_n\}$ is independent; 

(2) Every vector $w \in \sum_{i=1}^n W_i$ can be written as $w_1 + \cdots + w_n$, with $w_i \in W_i$ 
for each 1 ≤ i ≤ n, in exactly one way; 

(3) $W_h$ and $\sum_{i \neq h} W_i$ are disjoint, for each 1 ≤ h ≤ n. 


Proof (1) ⇒ (2): Let $w \in \sum_{i=1}^n W_i$ and assume that we can write $w = w_1 + \cdots + w_n = y_1 + \cdots + y_n$, 
where $w_i, y_i \in W_i$ for each 1 ≤ i ≤ n. Then $\sum_{i=1}^n (w_i - y_i) = 0_V$ 
and so, by (1), it follows that $w_i - y_i = 0_V$ for each 1 ≤ i ≤ n, proving (2). 

(2) ⇒ (3): Assume that $0_V \neq w_h \in W_h \cap \sum_{i \neq h} W_i$. Then for each i ≠ h there 
exists an element $w_i \in W_i$ satisfying $w_h = \sum_{i \neq h} w_i$, contradicting (2). 

(3) ⇒ (1): Suppose we can write $w_1 + \cdots + w_n = 0_V$, where $w_i \in W_i$ for each 
1 ≤ i ≤ n, and where $w_h \neq 0_V$ for some h. Then $w_h = -\sum_{i \neq h} w_i \in W_h \cap \sum_{i \neq h} W_i$, 
and this contradicts (3). Thus (1) must hold. 


If V is a vector space over a field F and if $\{W_i \mid i \in \Omega\}$ is an independent 
collection of subspaces of V, we write $\bigoplus_{i \in \Omega} W_i$ instead of $\sum_{i \in \Omega} W_i$. If Ω = {1, ..., n}, 
we will also write this sum as $W_1 \oplus \cdots \oplus W_n$. If $V = \bigoplus_{i \in \Omega} W_i$, then we say that V 
has a direct-sum decomposition relative to the subspaces $W_i$. 


Example If B is a basis of a vector space V over a field F then $V = \bigoplus_{v \in B} Fv$. 

The importance of direct-sum decompositions is illustrated by the following result. 


Proposition 5.13 Let V be a vector space over a field F, let $\{W_i \mid i \in \Omega\}$ be 
a pairwise disjoint collection of subspaces of V and, for each i ∈ Ω, let $B_i$ be 
a basis of $W_i$. Then $V = \bigoplus_{i \in \Omega} W_i$ if and only if $B = \bigcup_{i \in \Omega} B_i$ is a basis of V. 


Proof Assume $V = \bigoplus_{i \in \Omega} W_i$ and let v ∈ V. Then there exists a finite subset Λ 
of Ω such that $v \in \bigoplus_{i \in \Lambda} W_i$, and so for each i ∈ Λ there is an element $w_i \in W_i$ 
satisfying $v = \sum_{i \in \Lambda} w_i$. Moreover, each $w_i$ is a linear combination of elements 
of $B_i$. Thus v is a linear combination of elements of B, and so B is a generating 
set for V. We are left to show that B is linearly independent. If this is not the case, 
then there exist an element h of Ω, vectors $v_1, \dots, v_t$ in $B_h$, and scalars $a_1, \dots, a_t$ 
in F, not all of which are equal to 0, such that $\sum_{j=1}^t a_j v_j + u = 0_V$, where u is a 
linear combination of elements of $\bigcup_{i \neq h} B_i$. But then $\sum_{j=1}^t a_j v_j \in W_h \cap \sum_{i \neq h} W_i$, 
and this vector is nonzero because $B_h$ is linearly independent, 
contradicting our initial assumption. Thus $B = \bigcup_{i \in \Omega} B_i$ is a basis of V. 

Conversely, if $B = \bigcup_{i \in \Omega} B_i$ is a basis of V, it then follows that every element of V can be 
written in a unique way as $\sum_{i \in \Lambda} w_i$, where Λ is some finite subset of Ω, which 
suffices to prove that $V = \bigoplus_{i \in \Omega} W_i$. 


Let W be a subspace of a vector space V over a field F. A subspace Y of V is a 
complement of W in V if and only if V = W ⊕ Y. We immediately note that if Y is 
a complement of W in V then W is a complement of Y in V. In general, a subspace 
of a vector space can have many complements. 


Example Each of the following subspaces of R? is a complement of each of the 
others in R?: 


Proposition 5.14 Every subspace W of a vector space V over a field F has 
at least one complement in V . 


Proof If W is improper, then {$0_V$} is a complement of W in V. Similarly, V is a 
complement of {$0_V$} in V. Otherwise, let B be a basis of W. By Proposition 5.8, we 
know that there exists a linearly-independent subset D of V such that B ∪ D is a 
basis of V. Then FD is a complement of W in V. 

Example Let F be a field of characteristic other than 2, let n be a positive integer, 
and let V be a vector space over F. Let $W = M_{n \times n}(V)$, which is also a vector 
space over F. Let $W_1$ be the set of all those matrices $A = [v_{ij}]$ in W satisfying 
$v_{ij} = v_{ji}$ for all 1 ≤ i, j ≤ n, and let $W_2$ be the set of all those matrices $A = [v_{ij}]$ 
in W satisfying $v_{ij} = -v_{ji}$ for all 1 ≤ i, j ≤ n. These two subspaces are disjoint. If 
$A = [v_{ij}]$ is an arbitrary matrix in W, then we can write A = B + C, where $B = [y_{ij}]$ 
is the matrix defined by $y_{ij} = \frac{1}{2}(v_{ij} + v_{ji})$ for all 1 ≤ i, j ≤ n, and $C = [z_{ij}]$ is the 
matrix defined by $z_{ij} = \frac{1}{2}(v_{ij} - v_{ji})$ for all 1 ≤ i, j ≤ n. Note that $B \in W_1$ and 
$C \in W_2$. Thus $W = W_1 \oplus W_2$. 
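
The decomposition of a square matrix into a symmetric and a skew-symmetric part can be tried out numerically. The following is only a minimal sketch, assuming V = F = ℚ (so that entries are rational numbers); the function name `split_symmetric_skew` and the sample matrix are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def split_symmetric_skew(A):
    """Return (B, C) with B symmetric, C skew-symmetric and A = B + C.

    Assumes A is a square matrix (list of rows) with rational entries;
    dividing by 2 is what requires characteristic other than 2.
    """
    n = len(A)
    half = Fraction(1, 2)
    B = [[half * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]
    C = [[half * (A[i][j] - A[j][i]) for j in range(n)] for i in range(n)]
    return B, C

# Hypothetical sample matrix.
A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
B, C = split_symmetric_skew(A)

n = len(A)
is_symmetric = all(B[i][j] == B[j][i] for i in range(n) for j in range(n))
is_skew = all(C[i][j] == -C[j][i] for i in range(n) for j in range(n))
recovers_A = all(B[i][j] + C[i][j] == A[i][j] for i in range(n) for j in range(n))
```

Exact rational arithmetic is used so that the three checks are genuine equalities rather than floating-point approximations.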


Example A function $f \in \mathbb{R}^{\mathbb{R}}$ is even if and only if f(a) = f(−a) for all a ∈ ℝ; it 
is odd if and only if f(a) = −f(−a) for all a ∈ ℝ. The set W of all even functions 
is clearly a subspace of $\mathbb{R}^{\mathbb{R}}$, as is the set Y of all odd functions, and these two 
subspaces are disjoint. Moreover, if $f \in \mathbb{R}^{\mathbb{R}}$ then $f = f_1 + f_2$, where the function 
$f_1 \colon x \mapsto \frac{1}{2}[f(x) + f(-x)]$ is in W and the function $f_2 \colon x \mapsto \frac{1}{2}[f(x) - f(-x)]$ is 
in Y. Thus Y is a complement of W in $\mathbb{R}^{\mathbb{R}}$. 
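
As a small illustration of the even/odd splitting, the sketch below applies the two formulas to the exponential function, for which the even and odd parts are the familiar hyperbolic cosine and sine. The helper names `even_part` and `odd_part` are hypothetical, and floating-point arithmetic is used, so equalities hold only approximately.

```python
import math

def even_part(f):
    """Return the function x -> (f(x) + f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    """Return the function x -> (f(x) - f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) - f(-x))

f = math.exp
f1 = even_part(f)   # agrees with cosh
f2 = odd_part(f)    # agrees with sinh

x = 0.7
sum_matches = math.isclose(f1(x) + f2(x), f(x))
f1_is_even = math.isclose(f1(x), f1(-x))
f2_is_odd = math.isclose(f2(x), -f2(-x))
```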


Proposition 5.15 Let F be a field which is not finite and let V be a vector 
space over F having dimension at least 2. Then every proper nontrivial sub- 
space W of V has infinitely-many complements in V . 


Proof By Proposition 5.14, we know that W has at least one complement U in V. 
Choose a basis B for U. If $0_V \neq w \in W$, then by Proposition 3.2(9) and the fact 
that F is infinite, we know that Fw is an infinite subset of W. Thus we know that 
the set W is infinite. For each w ∈ W, let $Y_w = F\{u + w \mid u \in B\}$. We claim that 
each of these spaces is a complement of W in V. Indeed, assume that $v \in W \cap Y_w$. 
Then there exist elements $u_1, \dots, u_n$ of B and scalars $c_1, \dots, c_n$ in F satisfying 
$v = \sum_{i=1}^n c_i(u_i + w)$. But then $\sum_{i=1}^n c_i u_i = v - \left(\sum_{i=1}^n c_i\right)w \in W \cap U = \{0_V\}$ 
and since the set $\{u_1, \dots, u_n\}$ is linearly independent, we see that $c_i = 0$ for all i. 
This shows that $v = 0_V$, and we have thus shown that W and $Y_w$ are disjoint. If 
v is an arbitrary element of V, let us write $v = x + \left(\sum_{i=1}^n c_i u_i\right)$, where x ∈ W, 
the vectors $u_1, \dots, u_n$ belong to B, and the scalars $c_1, \dots, c_n$ belong to F. Then 
$v = \left[x - \left(\sum_{i=1}^n c_i\right)w\right] + \sum_{i=1}^n c_i(u_i + w) \in W + Y_w$ and thus we have shown that 
$V = W + Y_w$ and so $Y_w$ is a complement of W in V. 

We are left to show that all of these complements are indeed different from each 
other. Indeed, assume that w ≠ x are elements of W satisfying $Y_w = Y_x$. If u ∈ B 
then there exist elements $u_1, \dots, u_n$ of B and scalars $c_1, \dots, c_n$ such that 
$u + w = \sum_{i=1}^n c_i(u_i + x)$. From this it follows that $u - \sum_{i=1}^n c_i u_i = \left(\sum_{i=1}^n c_i\right)x - w$ and 
this belongs to $W \cap U = \{0_V\}$. But B is a linearly-independent set and so u has to 
be equal to one of the $u_i$, say $u_h$ for some 1 ≤ h ≤ n, and we must have $c_i = 0$ for i ≠ h and 
$c_h = 1$. Hence $x - w = 0_V$, namely x = w. This is a contradiction, and so the $Y_w$ 
must all be distinct. 


Proposition 5.16 (Grassmann’s Theorem) Let V be a vector space over a 
field F and let W and Y be subspaces of V satisfying the condition that W + Y 
has finite dimension. Then dim(W + Y) = dim(W) + dim(Y) − dim(W ∩ Y). 


Proof Let $U_0 = W \cap Y$, which is a subspace both of W and of Y. In particular, $U_0$ 
has a complement $U_1$ in W and a complement $U_2$ in Y. Then $W + Y = U_0 + U_1 + U_2$. 
We claim that in fact $W + Y = U_0 \oplus U_1 \oplus U_2$. Indeed, assume that $u_0 + u_1 + u_2 = 0_V$, 
where $u_j \in U_j$ for j = 0, 1, 2. Then $u_1 = -u_2 - u_0 \in W \cap Y = U_0$. But 
$U_0$ and $U_1$ are disjoint and so $u_1 = 0_V$. Therefore, $u_0 = -u_2 \in U_0 \cap U_2 = \{0_V\}$. 
Therefore, $u_0 = 0_V$ and $u_2 = 0_V$ as well. Thus we see that the set $\{U_0, U_1, U_2\}$ is 
independent. Therefore, from the definition of the complement, we have 


$$\dim(W + Y) = \dim(U_0) + \dim(U_1) + \dim(U_2) = \dim(W) + \dim(U_2)$$


and this equals dim(W) + dim(Y) − dim(W ∩ Y) since $Y = U_2 \oplus (W \cap Y)$. 


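Example Consider the subspaces 


$$W_1 = \mathbb{R}\left\{\begin{bmatrix}1\\0\\2\end{bmatrix}, \begin{bmatrix}1\\2\\2\end{bmatrix}\right\} \quad\text{and}\quad W_2 = \mathbb{R}\left\{\begin{bmatrix}1\\1\\0\end{bmatrix}, \begin{bmatrix}0\\1\\1\end{bmatrix}\right\}$$


of ℝ³. Each one of these subspaces has dimension 2, and so we see that 
$2 \leq \dim(W_1 + W_2) \leq 3$. By Proposition 5.16, we see that, as a result of this, we have 
$1 \leq \dim(W_1 \cap W_2) \leq 2$. In order to ascertain the exact dimension of $W_1 \cap W_2$, 
we must find a basis for it. If $v \in W_1 \cap W_2$ then there exist scalars a, b, c, d satisfying 


$$a\begin{bmatrix}1\\0\\2\end{bmatrix} + b\begin{bmatrix}1\\2\\2\end{bmatrix} = c\begin{bmatrix}1\\1\\0\end{bmatrix} + d\begin{bmatrix}0\\1\\1\end{bmatrix},$$


and so a + b = c, 2b = c + d, and 2a + 2b = d, from which we conclude that b = −3a, c = −2a, and d = −4a. Thus 
v has to be of the form 


$$(-2a)\begin{bmatrix}1\\1\\0\end{bmatrix} + (-4a)\begin{bmatrix}0\\1\\1\end{bmatrix} = a\begin{bmatrix}-2\\-6\\-4\end{bmatrix},$$


which shows that $W_1 \cap W_2 = \mathbb{R}\begin{bmatrix}-2\\-6\\-4\end{bmatrix}$, and so it has dimension 1. 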
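
The dimension count in this example can be cross-checked by machine. The sketch below computes ranks over ℚ with a small hand-written Gaussian elimination (the helper `rank` is an illustrative implementation, not something defined in the text) and then applies Grassmann's formula dim(W₁ ∩ W₂) = dim(W₁) + dim(W₂) − dim(W₁ + W₂). The spanning vectors are those of the example, written as rows.

```python
from fractions import Fraction

def rank(rows):
    """Rank over Q of the matrix whose rows are given, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Look for a pivot in column c among the rows not yet used.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

W1 = [[1, 0, 2], [1, 2, 2]]   # spanning set of W_1
W2 = [[1, 1, 0], [0, 1, 1]]   # spanning set of W_2

dim_W1 = rank(W1)
dim_W2 = rank(W2)
dim_sum = rank(W1 + W2)                      # W_1 + W_2 is spanned by all four vectors
dim_intersection = dim_W1 + dim_W2 - dim_sum  # Grassmann's Theorem

# The vector found in the example lies in both subspaces:
v = [-2, -6, -4]
v_in_W1 = rank(W1 + [v]) == dim_W1
v_in_W2 = rank(W2 + [v]) == dim_W2
```

Adjoining `v` to a spanning set without raising the rank is exactly the statement that `v` already belongs to the span.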


Very often, we can reduce our computations by passing to complements. A good 
example of this is given by the following proposition. 




Proposition 5.17 Let V be a vector space over a field F and let W be a 
subspace of V having a complement Y in V. Let $\{v_1, \dots, v_n\}$ be a subset 
of V and, for each 1 ≤ i ≤ n, let $v_i = w_i + y_i$, where $w_i \in W$ and $y_i \in Y$. 
If the vectors $w_1, \dots, w_n$ are distinct and the set $\{w_1, \dots, w_n\}$ is linearly 
independent, then so is the set $\{v_1, \dots, v_n\}$. 


Proof Assume that there exist scalars $a_1, \dots, a_n$ satisfying $\sum_{i=1}^n a_i v_i = 0_V$. Then 
$\sum_{i=1}^n a_i w_i + \sum_{i=1}^n a_i y_i = 0_V$, and so, since W and Y are disjoint, 
$\sum_{i=1}^n a_i w_i = \sum_{i=1}^n a_i y_i = 0_V$. Since the vectors 
$w_1, \dots, w_n$ are distinct and $\{w_1, \dots, w_n\}$ is linearly independent, we must have 
$a_1 = \cdots = a_n = 0$, and so $\{v_1, \dots, v_n\}$ is linearly independent as well. 


Exercises 


Exercise 158 

Let $v_1$, $v_2$, and $v_3$ be distinct elements of a vector space V over a field 
F and let $c_1, c_2, c_3 \in F$. Under what conditions is the subset 
$\{c_2 v_3 - c_3 v_2,\; c_1 v_2 - c_2 v_1,\; c_3 v_1 - c_1 v_3\}$ of V linearly dependent? 


Exercise 159 
For which values of the real number t is the subset 


$$\left\{\begin{bmatrix}\cos(t) + i\sin(t)\\ 1\end{bmatrix}, \begin{bmatrix}1\\ \cos(t) - i\sin(t)\end{bmatrix}\right\}$$


of ℂ² linearly dependent? 


Exercise 160 

Let F be a field and let V be the subspace of F[X] consisting of all those 
polynomials of degree at most 4. Let $p_1(X), \dots, p_5(X)$ be distinct polynomials 
in V satisfying the condition that $p_i(0) = 1$ for each 1 ≤ i ≤ 5. Is the set 
$\{p_1(X), \dots, p_5(X)\}$ necessarily linearly dependent? 


Exercise 161 
Consider the functions $f \colon x \mapsto 5^x$ and $g \colon x \mapsto 5^{2x}$. Is {f, g} a linearly-dependent 
subset of $\mathbb{R}^{\mathbb{R}}$? 


Exercise 162 


N 
Q 


Find a, b € Q such that the subset a—b|,|b of Q? is linearly depen- 


dent. 
Exercise 163 

Let F = ℚ. Is the subset $\left\{\begin{bmatrix}4\\2\\1\end{bmatrix}, \begin{bmatrix}1\\0\\0\end{bmatrix}, \begin{bmatrix}1\\3\\4\end{bmatrix}\right\}$ of F³ linearly independent? 
What happens if F = GF(5)? 


Exercise 164 

Let V be a vector space over a field F and let n > 1 be an integer. Let Y be the 
set of all vectors $\begin{bmatrix}v_1\\ \vdots\\ v_n\end{bmatrix} \in V^n$ satisfying the condition that the set $\{v_1, \dots, v_n\}$ is 
linearly dependent. Is Y necessarily a subspace of $V^n$? 


Exercise 165 
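Is the subset $\left\{\begin{bmatrix}1+i\\3+8i\\5+7i\end{bmatrix}, \begin{bmatrix}1-i\\5\\2+i\end{bmatrix}, \begin{bmatrix}1+i\\3+2i\\4-i\end{bmatrix}\right\}$ of ℂ³ linearly independent 
when we consider ℂ³ as a vector space over ℂ? Is it linearly independent when 
we consider ℂ³ as a vector space over ℝ? 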


Exercise 166 
For each nonnegative integer n, let $f_n \in \mathbb{R}^{\mathbb{R}}$ be the function defined by $f_n \colon x \mapsto \sin^n(x)$. 
Is the subset $\{f_n \mid n \geq 0\}$ of $\mathbb{R}^{\mathbb{R}}$ linearly independent? 


Exercise 167 
Let V = C(−1, 1), which is a vector space over ℝ. Let f, g ∈ V be the functions 
defined by $f \colon x \mapsto x^2$ and $g \colon x \mapsto |x|x$. Is {f, g} linearly independent? 


Exercise 168 
Let V be a vector space over GF(5) and let $v_1, v_2, v_3 \in V$. Is the subset 
$\{v_1 + v_2,\; v_1 - v_2 + v_3,\; 2v_2 + v_3,\; v_2 + v_3\}$ of V linearly independent? 


Exercise 169 

Let F be a field of characteristic different from 2 and let V be a vector space 
over F containing a linearly-independent subset $\{v_1, v_2, v_3\}$. Show that the set 
$\{v_1 + v_2,\; v_2 + v_3,\; v_1 + v_3\}$ is also linearly independent. 


Exercise 170 


Is the subset of GF(3)* linearly independent? 


NNR rR 
areas 
<eovene 
=e 
Exercise 171 


ail 

Let t < n be positive integers and, for all 1 <i <1, let vj = : be a vector in 
ain 

IR" chosen so that 2|a;;| > = |a;;| for all 1 < j <n. Show that {vy,..., v;} 


is linearly independent. 


Exercise 172 
If $\{v_1, v_2, v_3, v_4\}$ is a linearly-independent subset of a vector space V over the 
field ℚ, is the set 


$$\{3v_1 + 2v_2 + v_3 + v_4,\; 2v_1 + 5v_2,\; 3v_3 + 2v_4,\; 3v_1 + 4v_2 + 2v_3 + 3v_4\}$$

linearly independent as well? 


Exercise 173 

Let A be a subset of R having at least three elements and let f), fo, f3 € R4 
be the functions defined by f; : xt x'~!2*—!. Is the set { fi, fo, f3} linearly 
independent? 


Exercise 174 

Let F = GF(5) and let V = F”, which is a vector space over F. Let f : x > x? 
and g: x +> x be elements of V. Find an element h of V such that { f, g, h} is 
linearly independent. 


Exercise 175 
Consider R as a vector space over Q. Is the subset {(a — 2)~! | a € Q} of this 
space linearly independent? 


Exercise 176 

In the vector space V = R® over R, consider the functions fiixp In((x? + 
Patt), forxt In(V/x2+D, and fg: x + In(xt + 7). Is the subset 
{ fi, fo, fa} of V linearly independent? 


Exercise 177 


Show that the subset 2/,} 1],] 0 of GF(p)? is linearly independent 


Oo 
— 


if and only if p 43. 


Exercise 178 

Let F be a subfield of a field K and let n be a positive integer. Show that a 
nonempty linearly-independent subset D of $F^n$ remains linearly independent 
when considered as a subset of $K^n$. 
Exercise 179 
Let F = GF(5) and let $V = F^F$. For 4 ≤ k ≤ 7, let $f_k \in V$ be defined by 
$f_k \colon a \mapsto a^k$. Is the subset $\{f_k \mid 4 \leq k \leq 7\}$ of V linearly independent? 


Exercise 180 

Let V be a vector space over ℝ. For vectors v ≠ w in V, let K(v, w) be the 
set of all vectors in V of the form (1 − a)v + aw, where 0 ≤ a ≤ 1. Given 
vectors v, w, y ∈ V satisfying the condition that the set {w − v, y − v} is linearly 
independent (and so, in particular, its elements are distinct), show that the set 


$$K\left(v, \tfrac{1}{2}(w + y)\right) \cap K\left(w, \tfrac{1}{2}(v + y)\right) \cap K\left(y, \tfrac{1}{2}(v + w)\right)$$

is nonempty, and determine how many elements it can have. 


Exercise 181 

Let V be a vector space finitely generated over a field F and let $B = \{v_1, \dots, v_n\}$ 
be a basis for V. Let y ∈ V ∖ B. Show that the set $\{v_1, \dots, v_n, y\}$ has a unique 
minimal linearly-dependent subset. 


Exercise 182 
Find all of the minimal linearly-dependent subsets of the subset 


to) Ls} Lo) Col ED 


Exercise 183 

Let V be a vector space over a field F and let D and D′ be distinct finite minimal 
linearly-dependent subsets of V which are not disjoint. If v ∈ D ∩ D′, show that 
(D ∪ D′) ∖ {v} is linearly dependent. 


Exercise 184 

Let F be a field of characteristic other than 2. Let V be the subspace of F[X] 
consisting of all polynomials of degree at most 3. Is $\{X + 2,\; X^2 + 1,\; X^3 + X^2,\; X^3 - X^2\}$ 
a basis of V? 


Exercise 185 
Let $\{v_1, \dots, v_n\}$ be a basis for a vector space V over a field F. Is the set 
$\{v_1 + v_2,\; v_2 + v_3,\; \dots,\; v_{n-1} + v_n,\; v_n + v_1\}$ necessarily also a basis for V over F? 


Exercise 186 
Is $\{1 + 2\sqrt{5},\; -3 + \sqrt{5}\}$ a basis for $\mathbb{Q}(\sqrt{5})$ as a vector space over ℚ? 
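Exercise 187 
For which values of a ∈ ℝ is the set 


$$\left\{\begin{bmatrix}a & 2a\\ 2 & 3a\end{bmatrix}, \begin{bmatrix}1 & 2\\ 2a & 3\end{bmatrix}, \begin{bmatrix}1 & 2a\\ a+1 & a+2\end{bmatrix}, \begin{bmatrix}1 & a+1\\ 2 & 2a+1\end{bmatrix}\right\}$$

a basis for $M_{2 \times 2}(\mathbb{R})$ as a vector space over ℝ? 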


Exercise 188 
Let F be an algebraically-closed field and let (K, e) be an associative F-algebra 
having a basis {v,, v2} as a vector space over F’. Show that iv = v2 or v5 = Ox. 


Exercise 189 

Let V be a vector space over a field F. A nonempty subset U of V is nearly 
linearly independent if and only if U is linearly dependent but U ∖ {u} is linearly 
independent for every u ∈ U. Find an example of a set of three vectors 
in R? which is nearly linearly independent. Does there exist a nearly linearly 
independent subset of R? having four elements? 


Exercise 190 
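Find a basis for the subspace W of ℝ⁴ generated by 


$$\left\{\begin{bmatrix}4\\2\\6\\-2\end{bmatrix}, \begin{bmatrix}1\\-1\\3\\-1\end{bmatrix}, \begin{bmatrix}1\\2\\0\\0\end{bmatrix}, \begin{bmatrix}1\\5\\-3\\1\end{bmatrix}\right\}.$$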


Exercise 191 
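For each real number a, let $f_a \in \mathbb{R}^{\mathbb{R}}$ be defined by 


$$f_a \colon r \mapsto \begin{cases} 1 & \text{if } r = a,\\ 0 & \text{otherwise.} \end{cases}$$


Is $\{f_a \mid a \in \mathbb{R}\}$ a basis for $\mathbb{R}^{\mathbb{R}}$ over ℝ? 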


Exercise 192 

Let A be a nonempty finite set and let V be the collection of all subsets of A, 
which is a vector space over GF(2). For each a ∈ A, let $v_a = \{a\}$. Is $\{v_a \mid a \in A\}$ 
a basis for V? 


Exercise 193 
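Show that 

$$\left\{\begin{bmatrix}1 & 0\\ 0 & 1\end{bmatrix}, \begin{bmatrix}0 & 1\\ 1 & 0\end{bmatrix}, \begin{bmatrix}0 & -i\\ i & 0\end{bmatrix}, \begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}\right\}$$

is a basis for the vector space $M_{2 \times 2}(\mathbb{C})$ over ℂ. (The last three of these matrices are known as the Pauli 
matrices and play a very important part in the formulation of quantum physics.) 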
Exercise 194 
Let F be a field and let a, b,c € F. Determine whether 


is a basis for F?. 


Exercise 195 
Let V be a vector space finitely generated over a field F having a basis 
$\{v_1, \dots, v_n\}$. Is $\left\{v_1,\; \sum_{i=1}^{2} v_i,\; \dots,\; \sum_{i=1}^{n} v_i\right\}$ necessarily a basis for V? 


Exercise 196 

Let F = GF(p) for some prime integer p, let n be a positive integer, and let V be 
a vector space of dimension n over F. In how many ways can we choose a basis 
for V? 


Exercise 197 
Let V be a three-dimensional vector space over a field F, with basis $\{v_1, v_2, v_3\}$. 
Is $\{v_1 + v_2,\; v_2 + v_3,\; v_1 - v_3\}$ a basis for V? 


Exercise 198 


Let V be a vector space of finite dimension n over ℂ having basis $\{v_1, \dots, v_n\}$. 
Show that $\{v_1, \dots, v_n, iv_1, \dots, iv_n\}$ is a basis for V, considered as a vector space 
over ℝ. 


Exercise 199 
Let V be a vector space of finite dimension n > 0 over ℝ and, for each positive 
integer i, let $U_i$ be a proper subspace of V. Show that $V \neq \bigcup_{i=1}^{\infty} U_i$. 


Exercise 200 

Let V be a vector space over a field F which is not finite dimensional, and 
let W be a proper subspace of V. Show that there exists an infinite collection 
$\{Y_1, Y_2, \dots\}$ of subspaces of V satisfying $\bigcap_{i=1}^{\infty} Y_i \subseteq W$ but $\bigcap_{i=1}^{n} Y_i \not\subseteq W$ for all 
n ≥ 1. 


Exercise 201 

Let V be the subspace of ℝ[X] consisting of all polynomials of degree at most 5, 
and let $A = \{X^5 + X^4,\; X^5 - 7X^3,\; X^5 - 1,\; X^5 + 3X\}$. Show that this subset of V 
is linearly independent and extend it to a basis of V. 


Exercise 202 

Let V be a vector space of finite dimension n over a field F, and let W be a 
subspace of V of dimension n − 1. If U is a subspace of V not contained in W, 
show that dim(W ∩ U) = dim(U) − 1. 
Exercise 203 

Let a, b, c, d be rational numbers such that $\{a + c\sqrt{3},\; b + d\sqrt{3}\}$ is a basis for 
$\mathbb{Q}(\sqrt{3})$ as a vector space over ℚ. Is $\{c + a\sqrt{3},\; d + b\sqrt{3}\}$ a basis for $\mathbb{Q}(\sqrt{3})$ as a 
vector space over ℚ? Is $\{a + c\sqrt{5},\; b + d\sqrt{5}\}$ a basis for $\mathbb{Q}(\sqrt{5})$ as a vector space 
over ℚ? 


Exercise 204 
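Find a real number a such that 


$$\dim\left(\mathbb{R}\left\{\begin{bmatrix}-9\\a\\-1\\-5\\-14\end{bmatrix}, \begin{bmatrix}2\\-5\\3\\0\\2\end{bmatrix}, \begin{bmatrix}1\\4\\-1\\1\\2\end{bmatrix}, \begin{bmatrix}3\\-1\\2\\1\\4\end{bmatrix}, \begin{bmatrix}-1\\9\\-4\\1\\0\end{bmatrix}\right\}\right) = 2.$$

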
Exercise 205 
2 1 —l 
1 2 1 4 : . : 
LetW=R 3/-lole| 3 C R". Determine the dimension of W and 
1 1 0 
find a basis for it. 
Exercise 206 
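Consider the vectors 

$$v_1 = \begin{bmatrix}0\\1\\0\\1\\0\end{bmatrix},\quad v_2 = \begin{bmatrix}7\\4\\1\\8\\3\end{bmatrix},\quad v_3 = \begin{bmatrix}0\\3\\0\\4\\0\end{bmatrix},\quad v_4 = \begin{bmatrix}1\\9\\5\\7\\1\end{bmatrix},\quad\text{and}\quad v_5 = \begin{bmatrix}0\\1\\0\\5\\0\end{bmatrix}$$

in the vector space ℚ⁵. Do there exist rational numbers $a_{ij}$, for 1 ≤ i, j ≤ 5, such 
that the subset $\left\{\sum_{j=1}^{5} a_{ij} v_j \mid 1 \leq i \leq 5\right\}$ of ℚ⁵ is linearly independent? 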


Exercise 207 

Let F be a subfield of a field K satisfying the condition that K is finitely gener- 
ated as a vector space over F. For each c € K, show that there exists a nonzero 
polynomial p(X) € F[X] satisfying p(c) = 0. 


Exercise 208 
1 —1 —5 
_ 2 1 4 _ —1 6 
Let W=R i|> 1 C R* and let V=R ol: 3 . Com- 
0 1 1 0 


pute dim(W + V) and dim(W 1 V). 
Exercise 209 

Let F be a subfield of a field K satisfying the condition that the dimension of K 
as a vector space over F is finite and equal to r. Let V be a vector space of finite 
dimension n > 0 over K. Find the dimension of V as a vector space over F’. 


Exercise 210 

Let V be a vector space over a field F having infinite dimension over F. Show 
that there exists an infinite sequence $W_1, W_2, \dots$ of proper subspaces of V, 
satisfying $\bigcup_{i=1}^{\infty} W_i = V$. 


Exercise 211 
Let F = GF(p), where p is a prime integer, and let V be a vector space over F 
having finite dimension n. How many subspaces of dimension 1 does V have? 


Exercise 212 

Let W be the subset of $\mathbb{R}^{\mathbb{R}}$ consisting of all functions of the form 
$x \mapsto a \cdot \cos(x - b)$, for real numbers a and b. Show that W is a subspace of $\mathbb{R}^{\mathbb{R}}$ and 
find its dimension. 


Exercise 213 


4 6 1 4 1 
_ 3 2 1 4 = 2 0 4 
LettW=R atolololy CR* andletY =R 01°13 CR’. 
1 2 pe —2 2 


Find dim(W + Y) and dim(W ∩ Y). 


Exercise 214 
Let V be a vector space of finite dimension n over a field F and let W and Y be 
distinct subspaces of V, each of dimension n − 1. What is dim(W ∩ Y)? 


Exercise 215 
Let V be a finite-dimensional vector space over a field F and let B be a basis of 


V such that {| | Wwe a} is a basis for V2. What is the dimension of V? 


Exercise 216 
Let F be a field and let V be the subspace of F[X] consisting of all polynomials 
of degree at most 4. Find a complement for V in F[X]. 


Exercise 217 
Let F be a field and let V be the subspace of F[X] consisting of all polynomials 
of the form $(X^2 + X + 1)p(X)$ for some p(X) ∈ F[X]. Find a complement for 
V in F[X]. 
Exercise 218 

Let B be a nonempty proper subset of a set A. Let F be a field and let $V = F^A$. 
Let W be the subspace of V consisting of all those functions f ∈ V satisfying 
f(b) = 0 for all b ∈ B. Find a complement of W in V. 


Exercise 219 
Let F be a field of characteristic other than 2, let V be a vector space over F’, and 


v v 

let U = v vveVS CVA ISY= v v € V ¢ acomplement 
v+u’ v 

of U in V? 

Exercise 220 


Let F be a field and let p(X) ∈ F[X] have positive degree k. Let W be the 
subspace of F[X] composed of all polynomials of the form p(X)g(X) for some 
g(X) ∈ F[X]. Show that W has a complement in F[X] of dimension k. 


Exercise 221 

Let V be a vector space over a field F which is not finite dimensional, and let 
$V \supset W_1 \supset W_2 \supset \cdots$ be a chain of subspaces of V, each properly contained in 
the one before it. Is the subspace $\bigcap_{i=1}^{\infty} W_i$ of V necessarily finite-dimensional? 


Exercise 222 

Let V be a vector space finitely generated over a field F. Let W and Y be 
subspaces of V and assume that there is a function f € F” satisfying the 
condition that f(w) < f(y) for all Oy 4 w € W and Oy # ye Y. Show that 
dim(W) + dim(Y) < dim(V). 


Exercise 223 

Let (K, •) be a division algebra of dimension 2 over ℝ containing an element 
$v_1$ which satisfies the condition that $v_1 \bullet v = v = v \bullet v_1$ for all v ∈ K. Show that 
(K, +, •) is a field. 


Exercise 224 

For each a ∈ ℝ, the set ℚ[a] = {p(a) | p(X) ∈ ℚ[X]} is a subspace of ℝ, 
considered as a vector space over ℚ. Find all pairs (a, b) of real numbers a ≠ b 
satisfying the condition that the set {ℚ[a], ℚ[b]} is independent over ℚ. 


Exercise 225 

Let V be a vector space over a field F. Find a necessary and sufficient condition 
for there to exist subspaces W and W′ of V such that $\{\{0_V\}, W, W'\}$ is 
independent. 


Exercise 226 
Let (K, •) be a unital ℝ-algebra (not necessarily associative) with multiplicative 
identity e, and let $\{v_i \mid i \in \Omega\}$ be a basis for K over ℝ containing e (which is 
equal to $v_t$ for some t ∈ Ω). If $v = \sum_{i \in \Omega} c_i v_i \in K$, set $\bar{v} = c_t v_t - \sum_{i \neq t} c_i v_i$. 
If v ∈ K, is it true that $v \bullet \bar{v} = \bar{v} \bullet v$ and $\bar{\bar{v}} = v$? (Note that this construction 
generalizes the notion of the conjugate of a complex number.) 


Exercise 227 

For each nonnegative integer n, define the subsets $P_n$, $A_n$, and $F_n$ of ℝ as follows: 

(1) $P_0 = \emptyset$, $A_0 = \{1\}$, and $F_0 = \mathbb{Q}$; 

(2) If n > 0, then $P_n$ is the set of the first n prime integers, $A_n$ consists of 1 and 
the set of square roots of products of distinct elements of $P_n$, and $F_n = \mathbb{Q}A_n$. 

Show that each $A_n$ is a linearly-independent subset of ℝ, considered as a vector 
space over ℚ, and that $F_n$ is a subfield of ℝ, having the property that every 
element of $F_n$, the square of which belongs to ℚ, must belong to ℚa, for some 
$a \in A_n$. 


Exercise 228 

Find all a € R (if any exist) satisfying the condition that the dimension of 
-1 2 1 

R 2a |, 2) 4) 1 is at most 2. 
—2 -1 0 


Exercise 229 

Give an example of a vector space V finitely generated over a field F, together 
with nonempty subsets $B_1$, $B_2$, and $B_3$ of V satisfying the following conditions: 

(1) Each $B_i$ is linearly independent; 

(2) For each 1 ≤ i ≠ j ≤ 3 there exists a basis of V containing $B_i \cup B_j$; 

(3) There is no basis of V containing $B_1 \cup B_2 \cup B_3$. 


Exercise 230 

Let V be a vector space over ℝ. A fuzzification of V is a function μ from V to 
the unit interval I of real numbers, satisfying the condition that μ(av + bw) ≥ 
min{μ(v), μ(w)} for all a, b ∈ ℝ and all v, w ∈ V. A finite nonempty linearly-independent 
subset $\{v_1, \dots, v_n\}$ of V is μ-linearly independent if and only if it 
satisfies the additional condition that $\mu\left(\sum_{i=1}^n a_i v_i\right) = \min\{\mu(a_1 v_1), \dots, \mu(a_n v_n)\}$. 

(1) Show that the function yz : R* > I defined by 


1 ifa=b=0, 
LL Hit 1 ifa=Oandb <0, 
i otherwise 


is a fuzzification on V. 


(2) Is the linearly-independent subset A ; | 


| of R also y-linearly inde- 


pendent? 
Exercise 231 

Let V be a vector space over a field F and let B be a fixed basis of V. We then 
know that each element v ∈ V can be written in a unique way as $v = \sum_{w \in B} c_w w$, 
where the $c_w$ are scalars, only finitely-many of which are nonzero. Let n(v) be 
the number of nonzero scalars $c_w$ in this representation. (Note that n(v) = 0 if 
and only if $v = 0_V$.) Define a relation ⪯ on V by setting $v_1 \preceq v_2$ if and only if 
$n(v_1) \leq n(v_2)$. Is this a partial order relation on V? 


Exercise 232 
Let V be a vector space over a field F and let D be a finite minimal 
linearly-dependent subset of V. Find dim(FD). 


Linear Transformations 


Let V and W be vector spaces over a field F. A function α : V → W is a linear 
transformation or homomorphism if and only if for all $v_1, v_2 \in V$ and a ∈ F we 
have $\alpha(v_1 + v_2) = \alpha(v_1) + \alpha(v_2)$ and $\alpha(av_1) = a\alpha(v_1)$. We note that, as a 
consequence of the second condition, we have $\alpha(0_V) = \alpha(0 \cdot 0_V) = 0\alpha(0_V) = 0_W$. If 
(K, •) and (L, ∗) are F-algebras, then a linear transformation α : K → L is a 
homomorphism of F-algebras if, in addition, it satisfies 
$\alpha(v_1 \bullet v_2) = \alpha(v_1) * \alpha(v_2)$ for all $v_1, v_2 \in K$. If both K and L are unital, then it is a 
homomorphism of unital F-algebras if it also sends the identity element of K for • 
to the identity element of L for ∗. 


With kind permission of the Archives of the Mathematisches Forschungsinstitut Ober- 
wolfach. 

Linear transformations between finite-dimensional vector spaces were 
studied by Peano. Linear transformations between infinite-dimensional 
spaces were first considered in the late nineteenth century by Italian 
mathematician Salvatore Pincherle. 


Example Let V be a vector space over a field F. Every scalar c ∈ F defines a linear 
transformation $\sigma_c \colon V \to V$ given by $\sigma_c \colon v \mapsto cv$. In particular, $\sigma_1$ is the identity 
function v ↦ v and $\sigma_0$ is the 0-function $v \mapsto 0_V$. 


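Example Let F be a field and let $a_1, \dots, a_6$ be scalars in F. The function 
$\alpha \colon F^2 \to F^3$ defined by 

$$\alpha \colon \begin{bmatrix}c_1\\c_2\end{bmatrix} \mapsto \begin{bmatrix}a_1 c_1 + a_2 c_2\\ a_3 c_1 + a_4 c_2\\ a_5 c_1 + a_6 c_2\end{bmatrix}$$

is a linear transformation. 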


The previous example can be generalized in an extremely significant manner. Let 
k and n be positive integers and let F be a field. Every matrix $A = [a_{ij}] \in M_{k \times n}(F)$ 


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 89 
DOI 10.1007/978-94-007-2636-9_6, © Springer Science+Business Media B.V. 2012 
defines a linear transformation from $F^n$ to $F^k$ given by 


$$\begin{bmatrix}c_1\\c_2\\ \vdots\\ c_n\end{bmatrix} \mapsto \begin{bmatrix}a_{11}c_1 + \cdots + a_{1n}c_n\\ a_{21}c_1 + \cdots + a_{2n}c_n\\ \vdots\\ a_{k1}c_1 + \cdots + a_{kn}c_n\end{bmatrix}.$$


In what follows, we will show that every linear transformation from $F^n$ to $F^k$ can 
be defined in this manner. 
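
To make the matrix construction concrete, here is a minimal sketch over F = ℚ that applies a k × n matrix to a column vector and spot-checks the two defining conditions of a linear transformation on sample data. The function name `apply` and all numerical entries are hypothetical; a check on a few vectors illustrates linearity but of course does not prove it.

```python
from fractions import Fraction

def apply(A, c):
    """Image of the vector c (length n) under the k x n matrix A (list of rows)."""
    return [sum(a * x for a, x in zip(row, c)) for row in A]

# A hypothetical 3 x 2 matrix, giving a map from Q^2 to Q^3.
A = [[1, 2], [3, 4], [5, 6]]

u = [Fraction(1), Fraction(-2)]
v = [Fraction(3, 2), Fraction(5)]
s = Fraction(7, 3)

# Additivity: A(u + v) = A(u) + A(v)
lhs_add = apply(A, [x + y for x, y in zip(u, v)])
rhs_add = [x + y for x, y in zip(apply(A, u), apply(A, v))]

# Homogeneity: A(s u) = s A(u)
lhs_scal = apply(A, [s * x for x in u])
rhs_scal = [s * y for y in apply(A, u)]

# The image of the first standard basis vector is the first column of A.
first_column = apply(A, [1, 0])
```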


Example Let F be a field of characteristic 0. Then there are linear transformations 
α and β from F[X] to itself defined by 


$$\alpha \colon \sum_{i=0}^{\infty} a_i X^i \mapsto \sum_{i=1}^{\infty} i a_i X^{i-1} \quad\text{and}\quad \beta \colon \sum_{i=0}^{\infty} a_i X^i \mapsto \sum_{i=0}^{\infty} (1+i)^{-1} a_i X^{i+1}.$$


(By 1 + i, we mean the sum of 1 + i copies of the identity element for multiplication 
of F; since the characteristic of F is 0, we know that this element is nonzero, and 
so is a unit in F.) 
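
The two maps in this example are formal differentiation and formal integration. The sketch below models them for F = ℚ, under the assumption that a polynomial is stored as its list of coefficients with index i holding the coefficient of X^i (the zero polynomial being the empty list); the names `alpha` and `beta` simply mirror the symbols α and β.

```python
from fractions import Fraction

def alpha(p):
    """Formal derivative: sum a_i X^i  ->  sum i * a_i X^(i-1)."""
    return [i * p[i] for i in range(1, len(p))]

def beta(p):
    """Formal integral: sum a_i X^i  ->  sum a_i / (1 + i) X^(i+1)."""
    return [Fraction(0)] + [Fraction(p[i]) / (1 + i) for i in range(len(p))]

# The polynomial 5 + 3X^2 - 2X^3.
p = [Fraction(5), Fraction(0), Fraction(3), Fraction(-2)]

derivative = alpha(p)            # 6X - 6X^2
integral = beta(p)               # 5X + X^3 - (1/2)X^4
round_trip = alpha(beta(p))      # differentiating the integral returns p
other_order = beta(alpha(p))     # integrating the derivative loses the constant term
```

The last two lines show that αβ is the identity on these samples while βα is not, so the order of composition matters.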


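Example Let V and W be vector spaces over a field F and let k and n be positive 
integers. For all 1 ≤ i ≤ k and 1 ≤ j ≤ n, let $\alpha_{ij} \colon V \to W$ be a linear transformation. 
Then there is a linear transformation from $M_{k \times n}(V)$ to $M_{k \times n}(W)$ defined by 


$$\begin{bmatrix}v_{11} & \cdots & v_{1n}\\ \vdots & & \vdots\\ v_{k1} & \cdots & v_{kn}\end{bmatrix} \mapsto \begin{bmatrix}\alpha_{11}(v_{11}) & \cdots & \alpha_{1n}(v_{1n})\\ \vdots & & \vdots\\ \alpha_{k1}(v_{k1}) & \cdots & \alpha_{kn}(v_{kn})\end{bmatrix}.$$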


Example Let V be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of all differentiable functions. 
For each f ∈ V, we define a function Df : ℝ × ℝ → ℝ, called the differential of f, 
by setting Df : (a, b) ↦ f′(a)b, where f′ is the derivative of f. Then the function 
$D \colon V \to \mathbb{R}^{\mathbb{R} \times \mathbb{R}}$ given by f ↦ Df is a linear transformation. Such linear 
transformations play an important part in differential geometry. 


Example Sometimes linear transformations between F-algebras which are not 
homomorphisms of F-algebras play an important role. Let (K, •) be an associative 
algebra over a field F and let c ∈ F. Then K is a Baxter algebra over F of weight 
c if and only if there exists a linear transformation α : K → K satisfying the 
condition that $\alpha(x) \bullet \alpha(y) = \alpha(\alpha(x) \bullet y) + \alpha(x \bullet \alpha(y)) + c\alpha(x \bullet y)$ for all x, y ∈ K. 
Thus, for example, if K is the ℝ-algebra of all continuous functions from ℝ to itself, 
the linear transformation α : K → K given by $\alpha(f) \colon t \mapsto \int_0^t f(s)\,ds$ defines on K 
the structure of a Baxter algebra of weight 0. If F is any field and if $K = F^{\mathbb{N}}$ with 
componentwise addition and multiplication, then the function α : K → K given by 
$\alpha \colon [a_1, a_2, \dots] \mapsto [a_1, a_1 + a_2, a_1 + a_2 + a_3, \dots]$ defines on K the structure of a 
Baxter algebra of weight −1 with respect to the identity above. 
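
The partial-sum operator can be tested against the Baxter identity on finite prefixes of sequences, since the n-th entry of every term depends only on the first n entries of x and y. The sketch below is a numerical spot-check over the integers with hypothetical sample sequences; the helper names are illustrative. With the identity taken exactly in the form α(x)•α(y) = α(α(x)•y) + α(x•α(y)) + cα(x•y), the check succeeds for c = −1 and fails for c = 1.

```python
from itertools import accumulate

def alpha(a):
    """Partial sums: [a1, a2, a3, ...] -> [a1, a1 + a2, a1 + a2 + a3, ...]."""
    return list(accumulate(a))

def mul(a, b):
    """Componentwise product of two sequences of equal length."""
    return [x * y for x, y in zip(a, b)]

def baxter_rhs(x, y, c):
    """alpha(alpha(x)*y) + alpha(x*alpha(y)) + c*alpha(x*y), componentwise."""
    t1 = alpha(mul(alpha(x), y))
    t2 = alpha(mul(x, alpha(y)))
    t3 = alpha(mul(x, y))
    return [p + q + c * r for p, q, r in zip(t1, t2, t3)]

# Hypothetical finite prefixes of two sequences.
x = [1, 4, -2, 3, 0]
y = [2, -1, 5, 1, 7]

lhs = mul(alpha(x), alpha(y))
holds_for_minus_one = lhs == baxter_rhs(x, y, -1)
holds_for_plus_one = lhs == baxter_rhs(x, y, 1)
```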
Example Linear transformations are considered nice from an algebraic point of 
view, but may be less so from an analytic point of view. Let B be a Hamel basis 
of ℝ over ℚ. Then for each real number r there exists a unique finite subset 
$\{u_1(r), \dots, u_{n(r)}(r)\}$ of B and scalars $a_1(r), \dots, a_{n(r)}(r)$ in ℚ satisfying 
$r = \sum_{j=1}^{n(r)} a_j(r) u_j(r)$. The function from ℝ to ℝ defined by $r \mapsto \sum_{j=1}^{n(r)} a_j(r)$ is a linear 
transformation of vector spaces over ℚ, but is not continuous at any r ∈ ℝ. 


Let V and W be vector spaces over a field F. To any function f : V → W we 
can associate the subset $\operatorname{gr}(f) = \left\{\begin{bmatrix}v\\ f(v)\end{bmatrix} \,\middle|\, v \in V\right\}$ of V × W, called the graph 
of f. We can use the notion of graph to characterize linear transformations in terms 
of subspaces. 


Proposition 6.1 Let V and W be vector spaces over a field F and let 
α : V → W be a function. Then α is a linear transformation if and only if 
gr(α) is a subspace of V × W. 


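Proof Assume that α is a linear transformation. If v, v′ ∈ V and c ∈ F then in 
V × W we have 


$$\begin{bmatrix}v\\ \alpha(v)\end{bmatrix} + \begin{bmatrix}v'\\ \alpha(v')\end{bmatrix} = \begin{bmatrix}v + v'\\ \alpha(v) + \alpha(v')\end{bmatrix} = \begin{bmatrix}v + v'\\ \alpha(v + v')\end{bmatrix} \in \operatorname{gr}(\alpha)$$


and 


$$c\begin{bmatrix}v\\ \alpha(v)\end{bmatrix} = \begin{bmatrix}cv\\ c\alpha(v)\end{bmatrix} = \begin{bmatrix}cv\\ \alpha(cv)\end{bmatrix} \in \operatorname{gr}(\alpha),$$


showing that gr(α) is closed under taking sums and scalar multiples, and so is a subspace of V × W. 

Conversely, if it is such a subspace then for v, v′ ∈ V and c ∈ F we note that 


$$\begin{bmatrix}v\\ \alpha(v)\end{bmatrix} + \begin{bmatrix}v'\\ \alpha(v')\end{bmatrix} = \begin{bmatrix}v + v'\\ \alpha(v) + \alpha(v')\end{bmatrix} \in \operatorname{gr}(\alpha),$$


and so we must have α(v) + α(v′) = α(v + v′). Similarly, 


$$c\begin{bmatrix}v\\ \alpha(v)\end{bmatrix} = \begin{bmatrix}cv\\ c\alpha(v)\end{bmatrix} \in \operatorname{gr}(\alpha),$$


and so we must have cα(v) = α(cv). Thus α is a linear transformation. 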


Let V and W be vector spaces over a field F. If α and β are linear transformations 
from V to W, they are, in particular, functions in $W^V$, and so the function α + β : 
V → W is defined by α + β : v ↦ α(v) + β(v) for all v ∈ V. For all v, v′ ∈ V and 
all c ∈ F, we have 


(α + β)(v + v′) = α(v + v′) + β(v + v′) 
= α(v) + α(v′) + β(v) + β(v′) 
= (α + β)(v) + (α + β)(v′) 


and (α + β)(cv) = α(cv) + β(cv) = cα(v) + cβ(v) = c[α(v) + β(v)] = c(α + β)(v). 
Thus we see that α + β is a linear transformation from V to W. If c ∈ F is a 
scalar then the function cα from V to W is defined by cα : v ↦ cα(v) and this, 
again, is a linear transformation from V to W. It is easy to check that the set of all 
linear transformations from V to W is a subspace of $W^V$, which we will denote by 
Hom(V, W), or $\operatorname{Hom}_F(V, W)$ in case the field needs to be emphasized. 


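Example Let $V$ and $W$ be vector spaces over $\mathbb{R}$ having complexifications
$U$ and $Y$, respectively. If $\alpha \in \operatorname{Hom}_{\mathbb{R}}(V, W)$ then the function
$\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \mapsto \begin{bmatrix} \alpha(v_1) \\ \alpha(v_2) \end{bmatrix}$
belongs to $\operatorname{Hom}_{\mathbb{C}}(U, Y)$.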


Since Hom(V, W) is a vector space, we can apply concepts we have already 
considered for vectors to linear transformations. For example, we can talk about 
a linearly-dependent or linearly-independent set of linear transformations from a 
vector space V over a field F to a vector space W over F. However, we must be 
very careful to remember that when we are doing so, we are working in the space 
Hom(V, W), and not in either V or W. The following example illustrates the pitfalls 
one can encounter. 


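Example Let $V$ and $W$ be vector spaces over the same field $F$. A nonempty subset
$D = \{\alpha_1, \ldots, \alpha_n\}$ of $\operatorname{Hom}(V, W)$ is locally linearly dependent if and only if
the subset $\{\alpha_1(v), \ldots, \alpha_n(v)\}$ of $W$ is linearly dependent for every $v \in V$. If $D$ is
a linearly-dependent subset of $\operatorname{Hom}(V, W)$, then there exist scalars $c_1, \ldots, c_n$, not
all of which are equal to 0, such that $\sum_{i=1}^{n} c_i\alpha_i$ is the 0-function. In particular,
for each $v \in V$ we see that $\sum_{i=1}^{n} c_i\alpha_i(v) = 0_W$ and so $D$ is locally linearly dependent.
The converse, however, is false. It may be possible for $D$ to be linearly
independent and still locally linearly dependent. To see this, take $V = W = F^2$
and let $D = \{\alpha_1, \alpha_2\} \subseteq \operatorname{Hom}(F^2, F^2)$, where we define
$\alpha_1 : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a \\ 0 \end{bmatrix}$ and
$\alpha_2 : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ 0 \end{bmatrix}$.
If $v \in F^2$, then $\{\alpha_1(v), \alpha_2(v)\}$ is a subset of the one-dimensional
subspace $F\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ of $F^2$ and so cannot be linearly independent. On the other hand, $D$
is linearly independent since if there exist scalars $c$ and $d$ satisfying the condition
that $c\alpha_1 + d\alpha_2$ is the 0-function, then

$$\begin{bmatrix} 0 \\ 0 \end{bmatrix} = c\alpha_1\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) + d\alpha_2\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} c \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} c \\ 0 \end{bmatrix},$$

which implies that $c = 0$. Similarly,

$$\begin{bmatrix} 0 \\ 0 \end{bmatrix} = c\alpha_1\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) + d\alpha_2\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 0 \\ 0 \end{bmatrix} + \begin{bmatrix} d \\ 0 \end{bmatrix} = \begin{bmatrix} d \\ 0 \end{bmatrix},$$

which implies that $d = 0$.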
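
The distinction drawn in this example can be verified by brute force over a small field. The sketch below assumes $F = GF(3)$ (any field would do; the choice and the helper names are illustrative only) and treats $\alpha_1(v), \alpha_2(v)$ as a list of two vectors, testing dependence with a $2 \times 2$ determinant.

```python
from itertools import product

P = 3  # assumption: F = GF(3)

def alpha1(v):
    a, b = v
    return (a % P, 0)      # [a, b] -> [a, 0]

def alpha2(v):
    a, b = v
    return (b % P, 0)      # [a, b] -> [b, 0]

def dependent_pair(x, y):
    """The pair x, y in F^2 is linearly dependent exactly when det[x y] = 0."""
    return (x[0] * y[1] - x[1] * y[0]) % P == 0

vectors = list(product(range(P), repeat=2))

# Local linear dependence: alpha1(v), alpha2(v) are dependent for every v.
locally_dependent = all(dependent_pair(alpha1(v), alpha2(v)) for v in vectors)

def is_zero_function(c, d):
    """True if c*alpha1 + d*alpha2 sends every v to the zero vector."""
    return all(
        ((c * alpha1(v)[0] + d * alpha2(v)[0]) % P,
         (c * alpha1(v)[1] + d * alpha2(v)[1]) % P) == (0, 0)
        for v in vectors
    )

# Linear independence in Hom(F^2, F^2): only (c, d) = (0, 0) should survive.
annihilating = [(c, d) for c, d in product(range(P), repeat=2) if is_zero_function(c, d)]
```

The list `annihilating` collects every coefficient pair giving the 0-function, so independence of $D$ corresponds to that list containing only the zero pair.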


The following proposition shows that the operation of a linear transformation is 
entirely determined by its action on elements of a basis. This result is extremely 
important, especially if the vector spaces involved are finitely generated. 




Proposition 6.2 Let $V$ and $W$ be vector spaces over a field $F$, and let $B$
be a basis of $V$. If $f \in W^B$ then there is a unique linear transformation
$\alpha \in \operatorname{Hom}(V, W)$ satisfying the condition that $\alpha(u) = f(u)$ for all $u \in B$.


Proof Since $B$ is a basis of $V$, we know that each vector $v \in V$ can be written as a
linear combination $v = \sum_{i=1}^{n} a_i u_i$ of elements of $B$ in a unique way. We now define
the function $\alpha : V \to W$ by $\alpha : v \mapsto \sum_{i=1}^{n} a_i f(u_i)$. This function is well defined as
a result of the uniqueness of representation of $v$, as was shown in Proposition 5.4.
Moreover, it is clear that $\alpha$ is a linear transformation. If $\beta : V \to W$ is a linear
transformation satisfying the condition that $\beta(u) = f(u)$ for all $u \in B$ then
$\beta(v) = \beta\left(\sum_{i=1}^{n} a_i u_i\right) = \sum_{i=1}^{n} a_i\beta(u_i) = \sum_{i=1}^{n} a_i f(u_i) = \alpha(v)$, and so $\beta = \alpha$. Thus $\alpha$ is
unique.
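
The construction in this proof is easy to mirror computationally. The following is a minimal sketch, assuming $V = \mathbb{Q}^3$ with its standard basis and $W = \mathbb{Q}^2$; the function name `extend_linearly` and the particular values of $f$ are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def extend_linearly(images):
    """Given images[i] = f(e_i) on the standard basis of Q^n, return the
    linear map alpha with alpha(e_i) = f(e_i), as in Proposition 6.2."""
    m = len(images[0])

    def alpha(v):
        out = [Fraction(0)] * m
        # v = sum a_i e_i, so alpha(v) = sum a_i f(e_i)
        for coeff, img in zip(v, images):
            for j in range(m):
                out[j] += Fraction(coeff) * img[j]
        return tuple(out)

    return alpha

f = [(1, 2), (0, 1), (3, 0)]   # hypothetical values f(e_1), f(e_2), f(e_3)
alpha = extend_linearly(f)

u, v = (2, -1, 1), (0, 4, 5)
u_plus_v = tuple(x + y for x, y in zip(u, v))
additive = alpha(u_plus_v) == tuple(x + y for x, y in zip(alpha(u), alpha(v)))
```

Here `alpha` agrees with `f` on the basis vectors and is determined everywhere else by linearity, which is exactly the content of the proposition.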


Example Let $F$ be a field and let $c_0, c_1, \ldots$ be a sequence of elements of $F$. Then we
have a linear transformation $\alpha : F[X] \to F$ defined by $\alpha : \sum_{i=0}^{n} a_i X^i \mapsto \sum_{i=0}^{n} a_i c_i$.


Example We can use Proposition 6.2 to show how uncommon linear transformations
really are. Let $F = GF(3)$ and let $V = F^4$. Then $V$ has $3^4 = 81$ elements and
so the number of functions from $V$ to itself is $81^{81}$. On the other hand, a basis $B$
for $V$ over $F$ has 4 elements and so, since every linear transformation from $V$ to
itself is totally determined by its action on $B$ and since any function from $B$ to $V$
defines such a linear transformation, we see that the number of linear transformations
from $V$ to itself is $81^4$. Therefore, the probability that a randomly-selected function
from $V$ to itself is a linear transformation is $81^4/81^{81} = 81^{-77}$, which is roughly
$0.11134 \times 10^{-146}$.
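
The counting argument can be sanity-checked on cases small enough to enumerate, and the final figure can be recomputed exactly. This is a sketch under the assumption that brute force over $GF(2)^2$ and $GF(3)^1$ is an acceptable stand-in for the (far too large) case $GF(3)^4$; the function name `count_linear_maps` is illustrative.

```python
import math
from fractions import Fraction
from itertools import product

def count_linear_maps(p, n):
    """Count, by exhaustive search, the functions GF(p)^n -> GF(p)^n that are linear."""
    vectors = list(product(range(p), repeat=n))

    def add(u, v):
        return tuple((x + y) % p for x, y in zip(u, v))

    def scale(c, v):
        return tuple((c * x) % p for x in v)

    count = 0
    for values in product(vectors, repeat=len(vectors)):
        f = dict(zip(vectors, values))
        additive = all(f[add(u, v)] == add(f[u], f[v]) for u in vectors for v in vectors)
        homogeneous = all(f[scale(c, v)] == scale(c, f[v]) for c in range(p) for v in vectors)
        count += additive and homogeneous
    return count

# The example itself: |V| = 81, with 81**4 linear maps among 81**81 functions.
size_V = 3 ** 4
probability = Fraction(size_V ** 4, size_V ** size_V)

# Decimal form 0.d1d2... x 10^exponent, computed through logarithms.
log10_probability = -77 * math.log10(81)
exponent = math.floor(log10_probability) + 1
mantissa = 10 ** (log10_probability - exponent)
```

For a space with $|V| = p^n$ elements the same reasoning predicts $|V|^n$ linear maps, which is what the exhaustive count returns in the two small cases.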


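Proposition 6.3 Let $V$, $W$, and $Y$ be vector spaces over a field $F$ and let
$\alpha : V \to W$ and $\beta : W \to Y$ be linear transformations. Then $\beta\alpha : V \to Y$ is
a linear transformation.


Proof If $v_1, v_2 \in V$ and if $c \in F$ then

$$\begin{aligned}
(\beta\alpha)(v_1 + v_2) &= \beta(\alpha(v_1 + v_2)) = \beta(\alpha(v_1) + \alpha(v_2)) \\
&= \beta(\alpha(v_1)) + \beta(\alpha(v_2)) = (\beta\alpha)(v_1) + (\beta\alpha)(v_2)
\end{aligned}$$

and $(\beta\alpha)(cv_1) = \beta(\alpha(cv_1)) = \beta(c\alpha(v_1)) = c\beta(\alpha(v_1)) = c(\beta\alpha)(v_1)$, which proves
the proposition.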


Example It is often important and insightful to write a linear transformation as a
composite of linear transformations of predetermined types. Consider the following
situation: Let $a < b$ be real numbers and let $V$ be the vector space over $\mathbb{R}$ consisting
of all functions from the closed interval $[a, b]$ to $\mathbb{R}$. Let $W$ be the subspace of $V$
consisting of all differentiable functions, and let $\delta : W \to V$ be the function which
assigns to each function $f \in W$ its derivative. For each real number $a < c < b$, let
$\varepsilon_c : V \to \mathbb{R}$ be the linear transformation defined by $\varepsilon_c : g \mapsto g(c)$. Then the Mean
Value Theorem from calculus says that the linear transformation $\beta : W \to \mathbb{R}$
defined by $\beta : f \mapsto [f(b) - f(a)](b - a)^{-1}$ agrees at each $f \in W$ with $\varepsilon_c\delta$ for some $c$
(which may depend on $f$).


Let $V$ and $W$ be vector spaces over a field $F$ and let $\alpha : V \to W$ be a linear
transformation. For $w \in W$, we denote $\{v \in V \mid \alpha(v) = w\}$ by $\alpha^{-1}(w)$. Note that
this set may be empty. In particular, we will be interested in $\alpha^{-1}(0_W) = \{v \in V \mid
\alpha(v) = 0_W\}$. This set is called the kernel of $\alpha$ and is denoted by $\ker(\alpha)$. Then $\ker(\alpha)$
is never empty, since it always contains $0_V$. If $U$ is a nonempty subset of $W$, set
$\alpha^{-1}(U) = \bigcup\{\alpha^{-1}(u) \mid u \in U\}$. It is easy to verify that $\alpha^{-1}(U)$ is a subspace of $V$
whenever $U$ is a subspace of $W$.


Example Let F be a field and let a © Hom(F 3 F*) be the linear transformation 


a—b 
a 0 a 
defined bya: |b} bw - . Then ker(a) = a||aceF 
Cc 0 
c 


Proposition 6.4 Let $V$ and $W$ be vector spaces over a field $F$ and let
$\alpha \in \operatorname{Hom}(V, W)$. Then $\ker(\alpha)$ is a subspace of $V$, which is trivial if and only
if $\alpha$ is monic.


Proof Let $v_1, v_2 \in \ker(\alpha)$ and let $a \in F$. Then $\alpha(v_1 + v_2) = \alpha(v_1) + \alpha(v_2) = 0_W +
0_W = 0_W$, and so $v_1 + v_2 \in \ker(\alpha)$. Similarly, $\alpha(av_1) = a\alpha(v_1) = a0_W = 0_W$ and
so $av_1 \in \ker(\alpha)$. This proves that $\ker(\alpha)$ is a subspace of $V$.

If $\alpha$ is monic then $\alpha^{-1}(w)$ can have at most one element for each $w \in W$, and
so, in particular, $\ker(\alpha) = \{0_V\}$. Conversely, suppose that $\ker(\alpha)$ is trivial and that
there exist elements $v_1 \neq v_2$ of $V$ satisfying $\alpha(v_1) = \alpha(v_2)$. Then $\alpha(v_1 - v_2) =
\alpha(v_1) - \alpha(v_2) = 0_W$ and so $v_1 - v_2 \in \ker(\alpha)$. Thus $v_1 - v_2 = 0_V$ and so $v_1 = v_2$,
which is a contradiction. Hence $\alpha$ must be monic.
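
The equivalence between "monic" and "trivial kernel" can be confirmed exhaustively in a small setting. The sketch below assumes $F = GF(3)$ and $V = W = F^2$, with each linear transformation represented by a $2 \times 2$ matrix; these choices and the helper names are illustrative only.

```python
from itertools import product

P = 3  # assumption: F = GF(3), V = W = F^2
V = list(product(range(P), repeat=2))

def apply(M, v):
    """Apply the linear map with matrix M (a tuple of rows) to the vector v."""
    return tuple(sum(m * x for m, x in zip(row, v)) % P for row in M)

def is_monic(M):
    """Injective: distinct vectors have distinct images."""
    return len({apply(M, v) for v in V}) == len(V)

def has_trivial_kernel(M):
    return [v for v in V if apply(M, v) == (0, 0)] == [(0, 0)]

matrices = [((a, b), (c, d)) for a, b, c, d in product(range(P), repeat=4)]
agree = all(is_monic(M) == has_trivial_kernel(M) for M in matrices)
monic_count = sum(is_monic(M) for M in matrices)
```

All 81 matrices are examined, and the two properties coincide on each of them, as the proposition asserts.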


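Let $V$ and $W$ be vector spaces over a field and let $\alpha : V \to W$ be a linear
transformation. The image of $\alpha$ is the subset $\operatorname{im}(\alpha) = \{\alpha(v) \mid v \in V\}$ of $W$. This
set is nonempty since $0_W = \alpha(0_V) \in \operatorname{im}(\alpha)$. Note that $w \in \operatorname{im}(\alpha)$ if and only if
$\alpha^{-1}(w) \neq \emptyset$. If $U$ is a nonempty subset of $V$, we denote the subset $\{\alpha(u) \mid u \in U\}$
of $W$ by $\alpha(U)$. Thus $\alpha(V) = \operatorname{im}(\alpha)$.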


Proposition 6.5 Let $V$ and $W$ be vector spaces over a field $F$ and let
$\alpha \in \operatorname{Hom}(V, W)$. Then $\operatorname{im}(\alpha)$ is a subspace of $W$, which is improper if and
only if $\alpha$ is epic.


Proof If $\alpha(v_1)$ and $\alpha(v_2)$ are in $\operatorname{im}(\alpha)$ and if $a \in F$, then $\alpha(v_1) + \alpha(v_2) =
\alpha(v_1 + v_2) \in \operatorname{im}(\alpha)$ and similarly $a\alpha(v_1) = \alpha(av_1) \in \operatorname{im}(\alpha)$, proving that $\operatorname{im}(\alpha)$ is a
subspace of $W$. The second part follows immediately from the definition of an epic
function.


A monic linear transformation between vector spaces over a field $F$ is called a
monomorphism; an epic linear transformation between vector spaces is called an
epimorphism. A bijective linear transformation between vector spaces is called an
isomorphism. If both spaces are also $F$-algebras, then a bijective homomorphism
of $F$-algebras is called an isomorphism of $F$-algebras. Similarly, a bijective
homomorphism of unital $F$-algebras is an isomorphism of unital $F$-algebras.


Example Let $F$ be a field and let $k$ and $n$ be positive integers. For each matrix
$A = [a_{ij}] \in M_{k \times n}(F)$, we can define the transpose of $A$ to be the matrix
$A^T \in M_{n \times k}(F)$ obtained from $A$ by interchanging its rows and columns. In other
words,

$$A^T = \begin{bmatrix} a_{11} & \cdots & a_{k1} \\ \vdots & \ddots & \vdots \\ a_{1n} & \cdots & a_{kn} \end{bmatrix}.$$

It is easy to check that the function $A \mapsto A^T$ is an
isomorphism from $M_{k \times n}(F)$ to $M_{n \times k}(F)$.


Example Let $K$ and $L$ be $F$-algebras. It is possible for a linear transformation
$\alpha : K \to L$ to be an isomorphism of vector spaces without being an isomorphism
of $F$-algebras. This is the case, for example, with the linear transformation
$\alpha : \mathbb{Q}(\sqrt{2}) \to \mathbb{Q}(\sqrt{5})$ given by $\alpha : a + b\sqrt{2} \mapsto a + b\sqrt{5}$.
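
To see concretely why this $\alpha$ fails to respect multiplication, one can compare $\alpha(\sqrt{2} \cdot \sqrt{2})$ with $\alpha(\sqrt{2}) \cdot \alpha(\sqrt{2})$. The sketch below stores $a + b\sqrt{d}$ as the coordinate pair $(a, b)$; this representation and the helper names `mul` and `alpha` are assumptions made for illustration.

```python
from fractions import Fraction

def mul(x, y, d):
    """Product of a + b*sqrt(d) and c + e*sqrt(d), each stored as a pair."""
    a, b = x
    c, e = y
    return (a * c + d * b * e, a * e + b * c)

def alpha(x):
    """a + b*sqrt(2) -> a + b*sqrt(5): the coordinate pair is unchanged."""
    return x

root2 = (Fraction(0), Fraction(1))          # sqrt(2) in Q(sqrt(2))

lhs = alpha(mul(root2, root2, 2))           # alpha(sqrt(2) * sqrt(2)) = alpha(2)
rhs = mul(alpha(root2), alpha(root2), 5)    # sqrt(5) * sqrt(5) in Q(sqrt(5))
```

On coordinates $\alpha$ is the identity map, so it is plainly a bijective linear transformation over $\mathbb{Q}$; the two products above differ, so it is not a homomorphism of $\mathbb{Q}$-algebras.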


Example Let $V$ be a vector space over a field $F$. Any linear transformation
$\alpha : V \to F$ other than the 0-function is an epimorphism. Indeed, if $\alpha$ is a nonzero
linear transformation and if $v_0 \in V$ satisfies the condition that $\alpha(v_0) = c \neq 0$, then
for any $a \in F$ we have $a = (ac^{-1})c = (ac^{-1})\alpha(v_0) = \alpha((ac^{-1})v_0) \in \operatorname{im}(\alpha)$.


Example Let $F$ be a field and let $\alpha : F^{(\infty)} \to F[X]$ be the function defined by
$\alpha : f \mapsto \sum_{i=0}^{\infty} f(i)X^i$, which is well-defined since only finitely-many of the $f(i)$
are nonzero. This is easily checked to be an isomorphism of vector spaces.


We have already seen that if $D$ is a basis of a vector space $V$ over a field $F$ then
there exists a bijective function $\theta : F^{(D)} \to V$, and it is easy to verify that this is in
fact an isomorphism of vector spaces. This leads us to the very important observation
that for any nontrivial vector space $V$ over a field $F$ there exists a nonempty set
$\Omega$ and an isomorphism $F^{(\Omega)} \to V$.

Let $V$ and $W$ be vector spaces over a field $F$ and let $B$ be a basis of $V$. Then we
can define a function $\varphi : \operatorname{Hom}(V, W) \to W^B$ by restriction: $\varphi(\alpha) : u \mapsto \alpha(u)$ for all
$u \in B$. It is straightforward to check that $\varphi$ is a linear transformation of vector spaces
over $F$. Moreover, by Proposition 6.2, we see that any function $f \in W^B$ is of the
form $\varphi(\alpha)$ for a unique element $\alpha$ of $\operatorname{Hom}(V, W)$. Therefore, $\varphi$ is an isomorphism.

Let $V$ and $W$ be vector spaces over a field $F$. If $\alpha : V \to W$ is a linear transformation
and $0_W \neq w \in W$ then $\alpha^{-1}(w)$ is not a subspace of $V$. However, the next
result shows that, if it is nonempty, it is close to being a subspace.


Proposition 6.6 Let $\alpha : V \to W$ be a linear transformation of vector spaces
over a field $F$ and let $w \in \operatorname{im}(\alpha)$. For any $v_0 \in \alpha^{-1}(w)$ we have $\alpha^{-1}(w) =
\{v + v_0 \mid v \in \ker(\alpha)\}$.


Proof If $v \in \ker(\alpha)$ then $\alpha(v + v_0) = \alpha(v) + \alpha(v_0) = 0_W + w = w$ and so $v + v_0 \in
\alpha^{-1}(w)$. Conversely, if $v_1 \in \alpha^{-1}(w)$ then $v_1 = (v_1 - v_0) + v_0$, where $v_1 - v_0 \in
\ker(\alpha)$ since $\alpha(v_1 - v_0) = \alpha(v_1) - \alpha(v_0) = w - w = 0_W$.
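
Proposition 6.6 says that every nonempty fibre $\alpha^{-1}(w)$ is a shifted copy of the kernel. A minimal sketch of this, assuming $F = GF(3)$ and a hypothetical matrix `A` for a linear transformation $\alpha : F^3 \to F^2$ (both chosen only for illustration), is the following.

```python
from itertools import product

P = 3                               # assumption: F = GF(3)
A = [(1, 2, 0), (0, 1, 1)]          # hypothetical matrix of alpha: F^3 -> F^2

def alpha(v):
    return tuple(sum(a * x for a, x in zip(row, v)) % P for row in A)

V = list(product(range(P), repeat=3))
kernel = {v for v in V if alpha(v) == (0, 0)}

def fibre(w):
    """The set alpha^{-1}(w), found by exhaustive search."""
    return {v for v in V if alpha(v) == w}

def shifted_kernel(v0):
    """The set {v + v0 : v in ker(alpha)}."""
    return {tuple((x + y) % P for x, y in zip(v, v0)) for v in kernel}

image = {alpha(v) for v in V}
all_fibres_are_shifts = all(
    shifted_kernel(v0) == fibre(w) for w in image for v0 in fibre(w)
)
```

The check runs over every $w$ in the image and every choice of $v_0$ in its fibre, reflecting the fact that the proposition holds for any $v_0 \in \alpha^{-1}(w)$.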


Note that if $w \neq 0_W$ then $\alpha^{-1}(w)$ is not a subspace of $V$ but rather the result
of “shifting” a subspace by adding a fixed nonzero vector to each of its elements.
Such a subset of a vector space is called an affine subset, or linear variety of a
vector space. Let $V$ and $W$ be vector spaces over a field $F$. An affine transformation
$\zeta : V \to W$ is a function of the form $v \mapsto \alpha(v) + y$, for some fixed $\alpha \in \operatorname{Hom}(V, W)$
and $y \in W$. It is clear that the sum of two affine transformations is again an affine
transformation, as is the product of an affine transformation by a scalar, so that
the set $\operatorname{Aff}(V, W)$ of all affine transformations from $V$ to $W$ is also a subspace
of $W^V$, which in turn contains $\operatorname{Hom}(V, W)$ as a subspace. Indeed, $\operatorname{Aff}(V, W) =
F(\operatorname{Hom}(V, W) \cup K)$, where $K$ is the set of all constant functions from $V$ to $W$.

Moreover, if $\zeta : V \to W$ is the affine transformation defined by $v \mapsto \alpha(v) + y$
and if $w \in W$, then $\zeta^{-1}(w) = \alpha^{-1}(w - y)$ and so is an affine subset of $V$.

Analysis of computational procedures in linear algebra often hinges on the
fact that when we think we are computing the effect of some linear transformation
$\alpha \in \operatorname{Hom}(V, W)$, we are in fact computing that of an affine transformation
$v \mapsto \alpha(v) + y$ where $y$ is a vector arising from computational or random errors
which, hopefully, is “very small” (in some sense) relative to $\alpha(v)$. Similarly, in linear
models in statistics one must allow for such an affine transformation, where $y$ is
a random error vector, assumed to have expectation 0.


Example Let $V = C(0, 1)$ and let $W$ be the subspace of $V$ composed of all
differentiable functions having a continuous derivative. Let $\delta : W \to V$ be the linear
transformation which assigns to each function $f \in W$ its derivative. Then $\ker(\delta)$
consists of all constant functions. If $g \in \operatorname{im}(\delta)$ then $g = \delta(f)$, where $f$ is the
function $f : x \mapsto \int_0^x g(t)\,dt$. Thus $\delta^{-1}(g)$ consists of all functions of the form
$f : x \mapsto \int_0^x g(t)\,dt + c$, where $c \in \mathbb{R}$.


Proposition 6.7 If $\alpha : V \to W$ is an isomorphism of vector spaces over a
field $F$ then there exists an isomorphism $\beta : W \to V$ satisfying $\beta\alpha(v) = v$
and $\alpha\beta(w) = w$ for all $v \in V$ and all $w \in W$.




Proof Define the function $\beta$ by $\beta(w) = v$ if and only if $w = \alpha(v)$. This function is
well-defined since every element $w$ is of the form $\alpha(v)$ for a unique element $v \in V$.
It is easy to check that the function $\beta$ is an isomorphism which satisfies the stated
conditions.


The function $\beta$ defined in Proposition 6.7 is denoted by $\alpha^{-1}$.


Let $V$ and $W$ be vector spaces over a field $F$. If there exists an isomorphism from
$V$ to $W$, we say that $V$ and $W$ are isomorphic and write $V \cong W$. It is easy to see
that if $V$, $W$, and $Y$ are vector spaces over $F$ then:
(1) $V \cong V$;
(2) If $V \cong W$ then $W \cong V$;
(3) If $V \cong W$ and $W \cong Y$ then $V \cong Y$.
It is also clear that if $\alpha : V \to W$ is an isomorphism between vector spaces over
$F$ and if $B$ is a basis of $V$ then $\{\alpha(u) \mid u \in B\}$ is a basis of $W$. As an immediate
consequence of this, we see that if $V \cong W$ then the dimensions of $V$ and $W$ are the
same. The converse is true if $V$ and $W$ are finitely generated, as we shall now see.


Proposition 6.8 Let $V$ and $W$ be vector spaces over a field $F$ having bases
$B$ and $D$, respectively, and assume that there exists a bijective function
$f : B \to D$. Then $V \cong W$.


Proof By Proposition 6.2, we know that there exists a linear transformation
$\alpha \in \operatorname{Hom}(V, W)$ satisfying the condition $\alpha(v) = f(v)$ for all $v \in B$. This linear
transformation is epic since $\operatorname{im}(\alpha)$ contains a basis of $W$. If $v' = \sum_{v \in B} a_v v$
(where only finitely-many of the coefficients $a_v$ are nonzero) belongs to $\ker(\alpha)$ then
$0_W = \alpha(v') = \alpha\left(\sum_{v \in B} a_v v\right) = \sum_{v \in B} a_v\alpha(v) = \sum_{v \in B} a_v f(v)$ and so $a_v = 0$ for all
$v \in B$, since $D$ is linearly independent. Therefore, $\ker(\alpha)$ is trivial, and this shows
that $\alpha$ is monic and hence an isomorphism.


In particular, if $V$ and $W$ are vector spaces of the same finite dimension $n$ over a
field $F$, then $V \cong W$.


Proposition 6.9 If $V$ and $W$ are vector spaces finitely generated over a field
$F$, then
(1) There exists a monomorphism from $V$ to $W$ if and only if $\dim(V) \le \dim(W)$;
(2) There exists an epimorphism from $V$ to $W$ if and only if $\dim(V) \ge \dim(W)$.


Proof (1) If there exists a monomorphism $\alpha$ from $V$ to $W$ then $V \cong \operatorname{im}(\alpha)$ and
so $\dim(W) \ge \dim(\operatorname{im}(\alpha)) = \dim(V)$. Conversely, assume that $\dim(V) \le \dim(W)$.
Then there exists a basis $B = \{v_1, \ldots, v_n\}$ of $V$ and there exists a basis $D =
\{w_1, \ldots, w_t\}$ of $W$, where $n \le t$. The function from $B$ to $W$ given by $v_i \mapsto w_i$
for all $1 \le i \le n$ can be extended to a linear transformation $\alpha : V \to W$, which is
monic and so is a monomorphism.

(2) If there exists an epimorphism $\alpha$ from $V$ to $W$ and if $\{v_1, \ldots, v_n\}$ is a basis
of $V$, then $\{\alpha(v_i) \mid 1 \le i \le n\}$ is a generating set of $W$ and so the dimension of
$W$ is at most $n = \dim(V)$. Conversely, if $n = \dim(V) \ge \dim(W) = t$, pick a basis
$\{w_1, \ldots, w_t\}$ of $W$ and a basis $B = \{v_1, \ldots, v_n\}$ of $V$. Define a function $f : B \to W$
by

$$f : v_i \mapsto \begin{cases} w_i & \text{for } 1 \le i \le t, \\ w_t & \text{for } t < i \le n. \end{cases}$$

From Proposition 6.2, it follows that there exists a linear transformation $\alpha : V \to W$
satisfying $\alpha(v_i) = f(v_i)$ for all $1 \le i \le n$, and this is the desired epimorphism.


Proposition 6.10 Let $V$ and $W$ be vector spaces over a field $F$, where $V$ is
finitely generated. Then $\dim(V) = \dim(\operatorname{im}(\alpha)) + \dim(\ker(\alpha))$ for any linear
transformation $\alpha \in \operatorname{Hom}(V, W)$.


Proof Let $\alpha \in \operatorname{Hom}(V, W)$. Set $V_1 = \ker(\alpha)$ and let $V_2$ be a complement of $V_1$
in $V$. By Proposition 5.16, we see that $\dim(V) = \dim(V_1) + \dim(V_2)$ and so it
suffices for us to show that $V_2 \cong \operatorname{im}(\alpha)$. Let $\alpha_2$ be the restriction of $\alpha$ to $V_2$.
Then $\alpha_2 \in \operatorname{Hom}(V_2, \operatorname{im}(\alpha))$. If $v_2 \in \ker(\alpha_2)$ then $v_2 \in V_2 \cap V_1 = \{0_V\}$. Thus $\alpha_2$
is a monomorphism. If $w \in \operatorname{im}(\alpha)$ then there exists an element $v$ of $V$ satisfying
$\alpha(v) = w$. Moreover, $v = v_1 + v_2$ for some $v_1 \in V_1$ and $v_2 \in V_2$ so $w = \alpha(v) =
\alpha(v_1) + \alpha(v_2) = 0_W + \alpha(v_2) = \alpha(v_2) = \alpha_2(v_2)$. Therefore, $\operatorname{im}(\alpha_2) = \operatorname{im}(\alpha)$, showing
that $\alpha_2$ is also an epimorphism and hence the desired isomorphism.
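
Over a finite field the dimension formula of Proposition 6.10 can be read off by counting elements, since a subspace of dimension $k$ over $GF(p)$ has exactly $p^k$ elements. The sketch below assumes $F = GF(2)$ and a hypothetical matrix `A` for $\alpha : F^4 \to F^3$; both are illustrative choices.

```python
from itertools import product

P = 2                                               # assumption: F = GF(2)
A = [(1, 0, 1, 1), (0, 1, 1, 0), (1, 1, 0, 1)]      # hypothetical alpha: F^4 -> F^3

def alpha(v):
    return tuple(sum(a * x for a, x in zip(row, v)) % P for row in A)

def dim_from_size(size):
    """Dimension k of a subspace over GF(P) having `size` = P**k elements."""
    k = 0
    while P ** k < size:
        k += 1
    assert P ** k == size, "size is not a power of P"
    return k

V = list(product(range(P), repeat=4))
kernel = [v for v in V if not any(alpha(v))]
image = {alpha(v) for v in V}

nullity = dim_from_size(len(kernel))
rank = dim_from_size(len(image))
```

In this instance the third row of `A` is the sum of the first two, so the image is a proper subspace of $F^3$ and the kernel is correspondingly larger.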


Let $V$ and $W$ be vector spaces over a field $F$. If $\alpha \in \operatorname{Hom}(V, W)$ then we define
the rank $\operatorname{rk}(\alpha)$ of $\alpha$ to be $\dim(\operatorname{im}(\alpha))$ and define the nullity $\operatorname{null}(\alpha)$ of $\alpha$ to be
$\dim(\ker(\alpha))$. Thus, Proposition 6.10 says that if $V$ has finite dimension $n$ then both
the rank and nullity of $\alpha$ are finite and their sum is $n$. The converse is also clearly
true: if the rank and nullity of $\alpha$ are both finite, then the dimension of $V$ is finite. Let
us give bounds on the rank and nullity of compositions of linear transformations.


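Proposition 6.11 (Sylvester’s Theorem) Let $V$, $W$, and $Y$ be vector spaces
finitely-generated over a field $F$ and let $\alpha : V \to W$ and $\beta : W \to Y$ be linear
transformations. Then
(1) $\operatorname{null}(\beta\alpha) \le \operatorname{null}(\alpha) + \operatorname{null}(\beta)$;
(2) $\operatorname{rk}(\alpha) + \operatorname{rk}(\beta) - \dim(W) \le \operatorname{rk}(\beta\alpha) \le \min\{\operatorname{rk}(\alpha), \operatorname{rk}(\beta)\}$.


Proof (1) Let $\beta_1$ be the restriction of $\beta$ to $\operatorname{im}(\alpha)$. Then $\ker(\beta_1)$ is a subspace of
$\ker(\beta)$. By Proposition 6.10, we have

$$\begin{aligned}
\operatorname{null}(\beta\alpha) = \dim(V) - \operatorname{rk}(\beta\alpha) &= [\dim(V) - \operatorname{rk}(\alpha)] + [\operatorname{rk}(\alpha) - \operatorname{rk}(\beta\alpha)] \\
&= \operatorname{null}(\alpha) + \operatorname{null}(\beta_1) \le \operatorname{null}(\alpha) + \operatorname{null}(\beta).
\end{aligned}$$

(2) Clearly, $\operatorname{im}(\beta\alpha)$ is a subspace of $\operatorname{im}(\beta)$ and so its dimension is no greater than
that of $\operatorname{im}(\beta)$. Moreover, $\operatorname{im}(\beta\alpha) = \operatorname{im}(\beta_1)$ and so $\operatorname{rk}(\beta\alpha) \le \operatorname{rk}(\alpha)$. Thus $\operatorname{rk}(\beta\alpha) \le
\min\{\operatorname{rk}(\alpha), \operatorname{rk}(\beta)\}$. Moreover, from (1) we see that

$$\begin{aligned}
\dim(V) - \operatorname{null}(\beta\alpha) &\ge \dim(V) - \operatorname{null}(\alpha) + \dim(W) - \operatorname{null}(\beta) - \dim(W) \\
&= \operatorname{rk}(\alpha) + \operatorname{rk}(\beta) - \dim(W),
\end{aligned}$$

and this proves that $\operatorname{rk}(\alpha) + \operatorname{rk}(\beta) - \dim(W) \le \operatorname{rk}(\beta\alpha)$.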
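

Sylvester's inequalities can be spot-checked numerically. The following is a sketch, assuming $F = GF(5)$ and representing $\alpha$ and $\beta$ by randomly generated matrices; ranks are computed by Gaussian elimination modulo 5, and nullities are obtained from Proposition 6.10 as dimension minus rank. All function names are illustrative, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
import random

P = 5  # assumption: F = GF(5)

def rank_mod_p(M, p=P):
    """Rank of the matrix M over GF(p), by Gauss-Jordan elimination."""
    M = [[x % p for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    rank = 0
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r][c]), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        inv = pow(M[rank][c], -1, p)                 # modular inverse of the pivot
        M[rank] = [(x * inv) % p for x in M[rank]]
        for r in range(rows):
            if r != rank and M[r][c]:
                factor = M[r][c]
                M[r] = [(x - factor * y) % p for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank

def matmul(X, Y, p=P):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % p
             for j in range(len(Y[0]))] for i in range(len(X))]

def random_matrix(rows, cols, rng):
    # Zeros are over-represented so that rank-deficient matrices occur often.
    return [[rng.randrange(P) if rng.random() < 0.6 else 0 for _ in range(cols)]
            for _ in range(rows)]

def sylvester_holds(A, B, dim_V, dim_W):
    """A is the dim_W x dim_V matrix of alpha; B is the dim_Y x dim_W matrix of beta."""
    rk_a, rk_b, rk_ba = rank_mod_p(A), rank_mod_p(B), rank_mod_p(matmul(B, A))
    null_a, null_b, null_ba = dim_V - rk_a, dim_W - rk_b, dim_V - rk_ba
    return (null_ba <= null_a + null_b
            and rk_a + rk_b - dim_W <= rk_ba <= min(rk_a, rk_b))

rng = random.Random(0)
all_hold = True
for _ in range(300):
    dim_V, dim_W, dim_Y = (rng.randint(1, 4) for _ in range(3))
    A = random_matrix(dim_W, dim_V, rng)
    B = random_matrix(dim_Y, dim_W, rng)
    all_hold = all_hold and sylvester_holds(A, B, dim_V, dim_W)
```

A random test of this kind cannot prove the theorem, but a single failing pair would expose an error in either the statement or the rank routine.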
Exercise 233
Which of the following statements are true for all vector spaces $V$ and $W$ over a
field $F$ and all $\alpha \in \operatorname{Hom}(V, W)$?
(1) $\alpha(A \cup B) = \alpha(A) \cup \alpha(B)$ for all nonempty subsets $A$ and $B$ of $V$;
(2) $\alpha(A \cap B) = \alpha(A) \cap \alpha(B)$ for all nonempty subsets $A$ and $B$ of $V$;
(3) $\alpha^{-1}(C \cup D) = \alpha^{-1}(C) \cup \alpha^{-1}(D)$ for all nonempty subsets $C$ and $D$ of $W$;
(4) $\alpha^{-1}(C \cap D) = \alpha^{-1}(C) \cap \alpha^{-1}(D)$ for all nonempty subsets $C$ and $D$ of $W$.


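Exercise 234
Let $\alpha : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear transformation satisfying
$\alpha\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 3 \\ 4 \end{bmatrix}$,
$\alpha\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, and
$\alpha\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 4 \end{bmatrix}$.
What is $\alpha\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$?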
Exercise 235 
1 1 
Let a : R? — R? be a linear transformation satisfying a 1 = ; 
0 -1 


1 0 3 
a 0 =]1],anda —1 = | 3 |. Finda vector v € R? for which 
3 
1 
0 
0 
Exercise 236
Let $F$ be a field and let $V$ be the subspace of $F[X]$ consisting of all polynomials
of degree at most 2. Let $\alpha : V \to F[X]$ be a linear transformation satisfying
$\alpha(1) = X$, $\alpha(X + 1) = X^5 + X^3$, and $\alpha(X^2 + X + 1) = X^4 - X^2 + 1$. What is
$\alpha(X^2 - X)$?


Exercise 237 
For each d € R, let wg : R? > R? be the function defined by 


[a a+b+d*+1 
Ag: b b> a 7 


Is there a number d having the property that ag is a linear transformation? What 
if we consider ag as a function from GF Gy to itself? 


Exercise 238 
For each d € R, let ay : R? > R? be the function defined by 


yell | gee 5da — db 
Bia Ue 8d? — 8d —6 |" 
Is there a number d having the property that wg is a linear transformation? 


Exercise 239
Let $V$ and $W$ be vector spaces over $\mathbb{Q}$ and let $\alpha : V \to W$ be a function satisfying
$\alpha(v + v') = \alpha(v) + \alpha(v')$ for all $v, v' \in V$. Is $\alpha$ necessarily a linear transformation?


Exercise 240
Let $\alpha : \mathbb{R} \to \mathbb{R}$ be a continuous function which satisfies $\alpha(a + b) = \alpha(a) + \alpha(b)$
for all $a, b \in \mathbb{R}$. Show that $\alpha$ is a linear transformation.


Exercise 241
Let $W$ and $W'$ be subspaces of a vector space $V$ over a field $F$ and assume
that we have linear transformations $\alpha : W \to V$ and $\beta : W' \to V$ satisfying the
condition that $\alpha(v) = \beta(v)$ for all $v \in W \cap W'$. Find a linear transformation
$\theta : W + W' \to V$, the restriction of which to $W$ equals $\alpha$ and the restriction of
which to $W'$ equals $\beta$, or show why no such linear transformation exists.


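Exercise 242
Let $F = GF(3)$ and let $\theta \in F^F$ be the function defined by $\theta(0) = 0$, $\theta(1) = 2$,
and $\theta(2) = 1$. Let $n$ be a positive integer and let $\alpha : F^n \to F^n$ be the function
defined by $\alpha : \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \mapsto \begin{bmatrix} \theta(a_1) \\ \vdots \\ \theta(a_n) \end{bmatrix}$. Is $\alpha$ a linear transformation?


Exercise 243
Does there exist a linear transformation $\alpha : \mathbb{Q}^4 \to \mathbb{Q}[X]$ satisfying
$\alpha\begin{bmatrix} 1 \\ 2 \\ 0 \\ -1 \end{bmatrix} = 2$,
$\alpha\begin{bmatrix} -1 \\ 1 \\ 1 \\ 1 \end{bmatrix} = X$, and
$\alpha\begin{bmatrix} -1 \\ 4 \\ 2 \\ 1 \end{bmatrix} = X + 1$?


Exercise 244
Let $B$ be a Hamel basis for $\mathbb{R}$ as a vector space over $\mathbb{Q}$ and let $1 \neq a \in \mathbb{R}$. Show
that there exists an element $y \in B$ satisfying $ay \notin B$.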


Exercise 245
For which nonnegative integers $h$ is the function $\alpha$ from $GF(3)^3$ to itself defined
by $\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a^h \\ b^h \\ c^h \end{bmatrix}$ a linear transformation?


Exercise 246
For any field $F$, let $\theta : F \to F$ be the function defined by

$$\theta : a \mapsto \begin{cases} 0 & \text{if } a = 0, \\ a^{-1} & \text{otherwise.} \end{cases}$$

This is clearly a linear transformation when $F = GF(2)$. Does there exist a field
other than $GF(2)$ for which $\theta$ is a linear transformation?


Exercise 247
Let $V = F^{\infty}$ and let $\alpha : V \to V$ be the function that assigns to each sequence
$[a_1, a_2, \ldots] \in V$ its sequence of partial sums, namely $[a_1, a_2, \ldots] \mapsto
\left[a_1, \sum_{i=1}^{2} a_i, \sum_{i=1}^{3} a_i, \ldots\right]$. Is $\alpha$ a linear transformation?


Exercise 248
Let $Y = \mathbb{R}^{\mathbb{R}} \times \mathbb{R}$. Is the function $\alpha : Y \to \mathbb{R}$ defined by $\alpha : (f, a) \mapsto f(a)$ a linear
transformation?


Exercise 249
Let $F$ be a field and let $b$ and $c$ be nonzero elements of $F$. Let $\alpha : F^{\infty} \to F^{\infty}$ be
the linear transformation defined by

$$\alpha : [a_1, a_2, \ldots] \mapsto [a_3 + ba_2 + ca_1, a_4 + ba_3 + ca_2, \ldots].$$

Let $y \in \ker(\alpha)$ be a vector satisfying the condition that two successive entries in
$y$ equal 0. Show that $y = [0, 0, \ldots]$.


Exercise 250
Consider the field $F = \mathbb{Q}(\sqrt{2})$ as a $\mathbb{Q}$-algebra. Show that the only homomorphisms
of $\mathbb{Q}$-algebras from $F$ to itself are the identity function and the function
$a + b\sqrt{2} \mapsto a - b\sqrt{2}$.


Exercise 251
Let $V$, $W$, and $Y$ be vector spaces finitely-generated over a field $F$ and let
$\alpha : V \to W$ be a linear transformation. Show that the set of all linear transformations
$\beta : W \to Y$ satisfying the condition that $\beta\alpha$ is the 0-transformation is a
subspace of $\operatorname{Hom}(W, Y)$, and calculate its dimension.


Exercise 252
Let $V$ and $W$ be vector spaces over a field $F$ and let $V'$ be a proper subspace
of $V$. Are $\{\alpha \in \operatorname{Hom}(V, W) \mid \ker(\alpha) \subseteq V'\}$ and $\{\alpha \in \operatorname{Hom}(V, W) \mid \ker(\alpha) \supseteq V'\}$
subspaces of $\operatorname{Hom}(V, W)$?


Exercise 253
Let $V$ and $W$ be vector spaces over a field $F$ and assume that there are subspaces
$V_1$ and $V_2$ of $V$, both of positive dimension, satisfying $V = V_1 \oplus V_2$. For
$i = 1, 2$, let $U_i = \{\alpha \in \operatorname{Hom}(V, W) \mid \ker(\alpha) \supseteq V_i\}$. Show that $\{U_1, U_2\}$ is an independent
set of subspaces of $\operatorname{Hom}(V, W)$. Is it necessarily true that $\operatorname{Hom}(V, W) =
U_1 \oplus U_2$?


Exercise 254 
Let F be a field, and let a : M2x2(F) > Mnpyxn(F) be a homomorphism of 


F-algebras for some n > 1. Show that a ({ 01) Al, 


Exercise 255
Let $V$ and $W$ be vector spaces over a field $F$ and let $\alpha, \beta : V \to W$ be linear
transformations satisfying the condition that for each $v \in V$ there exists a scalar
$c_v \in F$ (depending on $v$) satisfying $\beta(v) = c_v\alpha(v)$. Show that there exists a scalar
$c$ satisfying $\beta = c\alpha$.


Exercise 256 


Let V and W be vector spaces over a field F. Define a function g : Hom(V, W) > 


Hom(V x W, V x W) by setting g(a) : | b> me 


mation of vector spaces over F’? Is it a monomorphism? 


| Is ¢ a linear transfor- 
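Exercise 257
Find the kernel of the linear transformation $\alpha : \mathbb{R}^5 \to \mathbb{R}^3$ defined by

$$\alpha : \begin{bmatrix} a \\ b \\ c \\ d \\ e \end{bmatrix} \mapsto \begin{bmatrix} b + c - 2d + e \\ a + 2b + 3c - 4d \\ 2a + 2c - 2e \end{bmatrix}.$$


Exercise 258
Let $\alpha : \mathbb{R}^3 \to \mathbb{R}^3$ be the linear transformation defined by

$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 2a + 4b - c \\ 0 \\ 3c + 2b - a \end{bmatrix}.$$

Are $\operatorname{im}(\alpha)$ and $\ker(\alpha)$ disjoint?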


Exercise 259 


2 1 1 

—1 0 1 

Let W be the subspace Q ol-lol-lo 
1 1 1 


the linear transformation defined by setting a : a+2b+e 


—a—2b—-—c 


| of Q* and let a: W > Q? be 
a 

: |: Fina a 
c 

d 


basis for ker(q@). 


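Exercise 260
Let $F = GF(3)$ and let $\alpha : F^3 \to F^3$ be the linear transformation defined by
$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a + b \\ 2b + c \\ 0 \end{bmatrix}$. Find the kernel of $\alpha$.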
c 0 


Exercise 261 
Let a : R*+ — R? be the linear transformation defined by 


- 2a+4b+c-—d 
Qa: > 3a+b—2c 
7 a+5c+4d 


Do there exist a, b, d € Z such that € ker(a)? 


QxAS8 
Exercise 262
Let $V$ and $W$ be vector spaces over a field $F$. Let $\alpha \in \operatorname{Hom}(V, W)$ and
$\beta \in \operatorname{Hom}(W, V)$ satisfy the condition that $\alpha\beta\alpha = \alpha$. If $w \in \operatorname{im}(\alpha)$, show that
$\alpha^{-1}(w) = \{\beta(w) + v - \beta\alpha(v) \mid v \in V\}$.


Exercise 263
Let $V$, $W$, and $Y$ be vector spaces over a field $F$ and let $\alpha \in \operatorname{Hom}(V, W)$ and
$\beta \in \operatorname{Hom}(W, Y)$ satisfy the condition that $\operatorname{im}(\alpha)$ has a finitely-generated complement
in $W$ and $\operatorname{im}(\beta)$ has a finitely-generated complement in $Y$. Does $\operatorname{im}(\beta\alpha)$
necessarily have a finitely-generated complement in $Y$?


Exercise 264 
Let a : M3,3(R) > R be defined by a : [a;;] Sa a a;;. Show that a 
is a linear transformation and find a basis for ker(q@). 


Exercise 265
Let $F = GF(2)$ and let $n > 2$ be an integer. Let $W$ be the set of all vectors
$\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}$
in $F^n$ having an even number of nonzero entries. Show that $W$ is a subspace of
$F^n$ by showing that it is the kernel of some linear transformation.


Exercise 266
Let $A$ and $B$ be nonempty sets. Let $V$ be the collection of all subsets of $A$ and
let $W$ be the collection of all subsets of $B$, both of which are vector spaces
over $GF(2)$. Any function $f : A \to B$ defines a function $\alpha_f : W \to V$ by setting
$\alpha_f : D \mapsto \{a \in A \mid f(a) \in D\}$. Show that each such function $\alpha_f$ is a linear
transformation, and find its kernel.


Exercise 267
Let $V$ be a vector space over a field $F$ and let $\alpha : V^3 \to V$ be the function defined
by $\alpha : \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \mapsto v_1 + v_2 + v_3$. Show that $\alpha$ is a linear transformation and find its
kernel.


Exercise 268
Let $n$ be a positive integer and let $V$ be the subspace of $\mathbb{R}[X]$ composed of all
polynomials of degree at most $n$. Let $\alpha : V \to V$ be the linear transformation
given by $\alpha : p(X) \mapsto p(X + 1) - p(X)$. Find $\ker(\alpha)$ and $\operatorname{im}(\alpha)$.


Exercise 269
Let $\alpha : \mathbb{R}^3 \to \mathbb{R}^3$ be the linear transformation given by

$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a + b + c \\ -a - c \\ b \end{bmatrix}.$$

Find $\ker(\alpha)$ and $\operatorname{im}(\alpha)$.


Exercise 270
Find the kernel of the linear transformation $\alpha : \mathbb{Q}[X] \to \mathbb{R}$ defined by $\alpha :
p(X) \mapsto p(\sqrt{3})$.


Exercise 271
Let $V = C(0, 1)$. For each positive integer $n$, we define the $n$th Bernstein function
$\beta_n : V \to \mathbb{R}[X]$ by

$$\beta_n : f \mapsto \sum_{k=0}^{n} \frac{n!}{k!(n-k)!} f\left(\frac{k}{n}\right) X^k (1 - X)^{n-k}.$$

Show that each $\beta_n$ is a linear transformation and find $\bigcap_{n=1}^{\infty} \ker(\beta_n)$. (Note: the
Bernstein functions are used in building polynomial approximations to continuous
functions.)


With kind permission of the Archives of the Mathematisches Forschungsinstitut Ober- 
wolfach. 


Sergei Natanovich Bernstein was a twentieth-century Russian math- 
ematician who worked mostly in probability theory. 


Exercise 272
Let $V$ and $W$ be nontrivial vector spaces over a field $F$. Show that $W =
\sum\{\operatorname{im}(\alpha) \mid \alpha \in \operatorname{Hom}(V, W)\}$.


Exercise 273
Let $W$ be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of all twice-differentiable functions and
let $\alpha : W \to \mathbb{R}^{\mathbb{R}}$ be the linear transformation $\alpha : f \mapsto f''$. Find $\alpha^{-1}(f_0)$, where
$f_0 \in \mathbb{R}^{\mathbb{R}}$ is defined by $f_0 : x \mapsto x + 1$.


Exercise 274
Let $W$ be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of all differentiable functions and let
$\alpha : W \to \mathbb{R}^{\mathbb{R}}$ be the function defined by $\alpha(f) : x \mapsto f'(x) + \cos(x) f(x)$. Show
that $\alpha$ is a linear transformation and find its kernel.


Exercise 275
Let $n$ be a positive integer and let $V$ be a vector space over $\mathbb{C}$. Does there exist
a linear transformation $\alpha : V \to \mathbb{C}^n$ other than the 0-function satisfying the
condition that $\operatorname{im}(\alpha) \subseteq \mathbb{R}^n$?


Exercise 276
Let $V$ and $W$ be vector spaces over a field $F$ and let $\alpha : V \to W$ be a linear
transformation other than the 0-function. Find a linear transformation $\beta : V \to W$
satisfying $\operatorname{im}(\alpha) = \operatorname{im}(\beta) \neq \operatorname{im}(\alpha + \beta)$.


Exercise 277
Let $V$ be a finite-dimensional vector space over a field $F$ and let $\alpha, \beta \in
\operatorname{Hom}(V, V)$ be linear transformations satisfying $\operatorname{im}(\alpha) + \operatorname{im}(\beta) = V = \ker(\alpha) +
\ker(\beta)$. Show that $\operatorname{im}(\alpha) \cap \operatorname{im}(\beta) = \{0_V\} = \ker(\alpha) \cap \ker(\beta)$.


Exercise 278
Let $V$, $W$, and $Y$ be vector spaces over a field $F$ and let $\alpha \in \operatorname{Hom}(V, W)$ and
$\beta \in \operatorname{Hom}(W, Y)$ satisfy the condition that $\ker(\alpha)$ and $\ker(\beta)$ are both finitely
generated. Is $\ker(\beta\alpha)$ necessarily finitely generated?

Exercise 279 
Find a linear transformation a : Q? > Q* satisfying 


0.5 2 

ed 1 
im(a) =Q) 3 |: I 
Oo} | -4 


Exercise 280
Let $F = GF(2)$ and let $\alpha \in \operatorname{Hom}(F^7, F^3)$ be given by

$$\alpha : \begin{bmatrix} a_1 \\ \vdots \\ a_7 \end{bmatrix} \mapsto \begin{bmatrix} a_4 + a_5 + a_6 + a_7 \\ a_2 + a_3 + a_6 + a_7 \\ a_1 + a_3 + a_5 + a_7 \end{bmatrix}.$$

If $v$ is a nonzero element of $\ker(\alpha)$, show that at least three entries in $v$ are equal
to 1.


Exercise 281
Let $V$ and $W$ be vector spaces finitely-generated over a field $F$ and let
$\alpha \in \operatorname{Hom}(V, W)$. If $Y$ is a subspace of $W$, is it true that $\dim(\alpha^{-1}(Y)) \ge
\dim(V) - \dim(W) + \dim(Y)$?


Exercise 282
Let $V$ be a vector space over a field $F$ and let $Y = V^{\infty}$. Let $W$ be the subspace
of $Y$ consisting of all those sequences $[v_1, v_2, \ldots]$ in which $v_i = 0$ for all odd $i$
and let $W'$ be the subspace of $Y$ consisting of all those sequences in which $v_i = 0$
for all even $i$. Find a linear transformation from $Y$ to itself, the kernel of which
equals $W$ and the image of which equals $W'$.


Exercise 283
Let $W$ be the subspace of $\mathbb{R}^6$ composed of all vectors $\begin{bmatrix} a_1 \\ \vdots \\ a_6 \end{bmatrix}$ satisfying
$\sum_{i=1}^{6} a_i = 0$. Does there exist a monomorphism from $W$ to $\mathbb{R}^4$?


Exercise 284
Let $n$ be a positive integer and let $\alpha : \mathbb{Q}^n \to \mathbb{Q}^n$ be a linear transformation which
is not a monomorphism. Does there necessarily exist a nonzero element of $\ker(\alpha)$
all the entries of which are integers?


Exercise 285
Let $n$ be a positive integer and let $W$ be the subspace of $\mathbb{C}[X]$ consisting of all
polynomials of degree less than $n$. Let $a_1, \ldots, a_n$ be distinct complex numbers
and let $\alpha : W \to \mathbb{C}^n$ be the function defined by
$\alpha : p(X) \mapsto \begin{bmatrix} p(a_1) \\ \vdots \\ p(a_n) \end{bmatrix}$. Is $\alpha$ a
monomorphism? Is it an isomorphism?


Exercise 286 

Let V be a vector space over a field F and let a: V — V be a linear transfor- 
mation satisfying the condition that «7 = aw + bo, where a and b are nonzero 
scalars. Show that a is a monomorphism. 


Exercise 287
Let $p$ be a prime integer and let $F$ be a field of characteristic $p$. Let $(K, \bullet)$ be an
associative and commutative unital $F$-algebra and let $\alpha : K \to K$ be the function
defined by $\alpha : v \mapsto v^p$. Show that $\alpha$ is an isomorphism of unital $F$-algebras.


Exercise 288
Let $F$ be a field and let $K$ and $K'$ be fields containing $F$. Show that every homomorphism
of $F$-algebras $K \to K'$ is a homomorphism of unital $F$-algebras.


Exercise 289 
Let F = GF(7). How many distinct monomorphisms can one define from F? 
to F4? 


Exercise 290 
Let V and W be vector spaces over a field F and let a, 8 € Hom(V, W) be 
monomorphisms. Is @ + 6 necessarily a monomorphism? 
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Exercise 291 

Let F be a field and let F’ be a field containing F. Let (K, e) be an F-algebra 
and let a: F’ > K bea nontrivial homomorphism of F-algebras. Show that a 
is monic. 


Exercise 292 

Let V and W be vector spaces over a field F and let a e Hom(V, W) be an 
epimorphism. Show that there exists a linear transformation B € Hom(W, V) 
satisfying the condition that wf is the identity function on W. 


Exercise 293 

Let V, W, and Y be vector spaces over a field F and let a e Hom(V, W) be 
an epimorphism. Show that for each linear transformation B € Hom(Y, W) there 
exists a linear transformation 0 € Hom(Y, V) such that 6 = a. 


Exercise 294 

Let V, W, and Y be vector spaces over a field F and let a e Hom(V, W) be a 
monomorphism. Show that for each linear transformation 6 € Hom(V, Y) there 
exists a linear transformation 0 € Hom(W, Y) such that 6 = 0a. 


Exercise 295 

Let V be a vector space finitely-generated over a field F, the dimension of which 
is even. Show that there exists an isomorphism a : V —> V satisfying the condi- 
tion that w?(v) = —v for all ve V. 


Exercise 296 

Let $\alpha : V \to W$ be a linear transformation between vector spaces over a field $F$
and let $D$ be a nonempty linearly-independent subset of $\operatorname{im}(\alpha)$. Show that there
exists a basis $B$ of $V$ satisfying the condition that $\{\alpha(v) \mid v \in B\} = D$.


Exercise 297 

Let $V$ and $W$ be vector spaces over a field $F$ and let $\alpha \in \operatorname{Hom}(V, W)$ satisfy the
condition that $\alpha\beta\alpha$ is not the 0-function for any linear transformation $\beta : W \to V$
which is not the 0-function. Show that $\alpha$ is an isomorphism.


Exercise 298 

Let F be a field and let a : F? > F[X] be the linear transformation defined by 
a 
b | (a+b)X + (a+c)X°. Find the nullity and rank of a. 
c 


Exercise 299 
Let $F$ be a field and let $p(X) = X^2 + bX + c \in F[X]$ be a polynomial having
distinct nonzero roots $d_1$ and $d_2$ in $F$. Let $\alpha : F^3 \to F$ be the linear transformation
defined by
$$\alpha : \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \mapsto a_3 + b a_2 + c a_1,$$
and let $\beta : F^\infty \to F^\infty$ be the linear transformation defined by
$$\beta : [a_1, a_2, a_3, \ldots] \mapsto \left[ \alpha\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \alpha\begin{bmatrix} a_2 \\ a_3 \\ a_4 \end{bmatrix}, \ldots \right].$$
Show that the nullity of $\beta$ is at least 2.


Exercise 300 
Let $\Omega$ be a nonempty set and let $V$ be the collection of all subsets of $\Omega$,
considered as a vector space over $GF(2)$. Show that this vector space is isomorphic
to $GF(2)^\Omega$.


Exercise 301 

Let $F$ be a field and let $V$ be the subspace of $F^\infty$ consisting of all sequences
$[a_1, a_2, a_3, \ldots]$ in which $a_i = 0$ for all even $i$. Let $W$ be the subspace of $F^\infty$
consisting of all sequences $[a_1, a_2, a_3, \ldots]$ in which $a_i = 0$ for all odd $i$. Show
that $V \cong F^\infty \cong W$.


Exercise 302 
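Let $V$ be a vector space over a field $F$ having subspaces $W$ and $W'$. Let
$Y = \left\{ \begin{bmatrix} w \\ w' \end{bmatrix} \;\middle|\; w \in W \text{ and } w' \in W' \right\}$, which is a subspace of $V^2$. Let $\alpha : Y \to V$
be the linear transformation defined by $\alpha : \begin{bmatrix} w \\ w' \end{bmatrix} \mapsto w + w'$. Find the kernel of $\alpha$,
and show that it is isomorphic to $W \cap W'$.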


Exercise 303 

Let $V$ be a vector space over a field $F$. Let $W$ be a subspace of $V$ and let $W'$ be
a complement of $W$ in $V$. Let $\alpha : W \to W'$ be a linear transformation. Show that
$W$ is isomorphic to the subspace $Y = \{w + \alpha(w) \mid w \in W\}$ of $V$.


Exercise 304 
Show that there is no vector space over any field F having precisely 15 elements. 


Exercise 305 
Let $F$ be a field and let $V = F[X]$. Show that $V \cong V^2$.


Exercise 306 

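Let $V$, $W$, and $Y$ be vector spaces over a field $F$. Let $\{\alpha_1, \ldots, \alpha_n\}$ be a finite
subset of $\operatorname{Hom}(V, W)$ and let $\beta \in \operatorname{Hom}(V, Y)$ be a linear transformation satisfying
$\bigcap_{i=1}^n \ker(\alpha_i) \subseteq \ker(\beta)$. Show that there exist linear transformations $\gamma_1, \ldots, \gamma_n$
in $\operatorname{Hom}(W, Y)$ satisfying $\beta = \sum_{i=1}^n \gamma_i\alpha_i$.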


Exercise 307

Let $V$ be a vector space over a field $F$ and let $W$ be a subspace of $V$. For each
$v \in V$, let $v + W = \{v + w \mid w \in W\}$. Let $V/W$ be the collection of all the
sets of the form $v + W$ for $v \in V$ and define operations of addition and scalar
multiplication on $V/W$ by setting $(v + W) + (v' + W) = (v + v') + W$ and
$c(v + W) = (cv) + W$ for all $v, v' \in V$ and $c \in F$. Show that:
(1) $v + W = v' + W$ if and only if $v - v' \in W$;
(2) $V/W$, with the given operations, is a vector space over $F$;
(3) The function $v \mapsto v + W$ is an epimorphism from $V$ to $V/W$, the kernel of
which equals $W$;
(4) Every complement of $W$ in $V$ is isomorphic to $V/W$;
(5) If $[v + W] \cap [v' + W] \ne \varnothing$ then $v + W = v' + W$.
The space $V/W$ is called the factor space of $V$ by $W$.


Exercise 308 

Let $F$ be a field and let $m > n$ be positive integers. Let $A$ and $B$ be fixed matrices
in $M_{n \times m}(F)$ and let $\theta : M_{m \times n}(F) \to M_{n \times m}(F)$ be the linear transformation
defined by $\theta : C \mapsto ACB$. Show that $\theta$ is not an isomorphism.


Exercise 309 

Let $F$ be a field and let $K$ and $L$ be fields containing $F$ as a subfield. Show
that the set of homomorphisms of unital $F$-algebras from $L$ to $K$ is a
linearly-independent subset of the vector space $K^L$ over $K$.


Exercise 310 

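Let $V$ and $W$ be vector spaces over a field $F$, with $V$ finitely generated, and let
$Y$ be a proper subspace of $V$. Let $\alpha \in \operatorname{Hom}(V, W)$ and let $\beta$ be the restriction of
$\alpha$ to $Y$. Show that either $\ker(\beta) \subset \ker(\alpha)$ or $\operatorname{im}(\beta) \subset \operatorname{im}(\alpha)$.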


Exercise 311 

Let $V$ be a vector space over a field $F$, and let $U \subset W$ be subspaces of $V$. Assume
that there exist $x, y \in V$ satisfying the condition that the affine sets $x + U =
\{x + u \mid u \in U\}$ and $y + W = \{y + w \mid w \in W\}$ have a vector in common. Show
that $x + U \subset y + W$.


Exercise 312 

Let $V$ and $W$ be vector spaces over a field $F$. A function $f : V \to W$ is linearly
independent if and only if $\operatorname{gr}(f)$ is a linearly-independent subset of $V \times W$.
(1) Show that if $f : V \to W$ is linearly independent and if $\alpha \in \operatorname{Hom}(V, W)$ then
$f + \alpha$ is linearly independent.
(2) Show that no linear transformation is linearly independent.


Exercise 313 

Let $V$ and $W$ be vector spaces over a field $F$. A linear transformation $\alpha : V \to
W$ is said to have algebraic degree $n$ if and only if the set $\{v, \alpha(v), \ldots, \alpha^n(v)\}$
is linearly dependent for any $v \in V$, but there exists an element $v_0$ of $V$ such
that the set $\{v_0, \alpha(v_0), \ldots, \alpha^{n-1}(v_0)\}$ is linearly independent. Find the algebraic
degree of $\alpha \in \operatorname{Hom}(\mathbb{R}^5, \mathbb{R}^5)$ defined by
$$\alpha : \begin{bmatrix} a \\ b \\ c \\ d \\ e \end{bmatrix} \mapsto \begin{bmatrix} a + 2b \\ b - c \\ a \\ c - a \\ c \end{bmatrix}.$$


Exercise 314 

Let $n$ be a positive integer and let $F$ be a field the characteristic of which does not
divide $n$. Let $W$ be the subspace of $M_{n \times n}(F)$ generated by $\{AB - BA \mid A, B \in
M_{n \times n}(F)\}$. Show that $\dim(W) = n^2 - 1$.


Exercise 315 
Let $V$ be a vector space finitely generated over a field $F$ and let $\alpha \in \operatorname{Hom}(V, V)$.
Show that there exists a positive integer $t$ satisfying $V = \operatorname{im}(\alpha^t) \oplus \ker(\alpha^t)$.


7 The Endomorphism Algebra of a Vector Space


Let $V$ be a vector space over a field $F$. A linear transformation $\alpha$ from $V$ to itself
is called an endomorphism of $V$. We will denote the set of all endomorphisms of
$V$ by $\operatorname{End}(V)$. This set is nonempty, since it includes the functions of the form
$\sigma_c : v \mapsto cv$ for $c \in F$. In particular, it includes the 0-endomorphism $\sigma_0 : v \mapsto 0_V$
and the identity endomorphism $\sigma_1 : v \mapsto v$. If $V$ is nontrivial, these functions are
not the same. We see that we have two operations defined on $\operatorname{End}(V)$: addition
and multiplication (given by composition). Indeed, as a direct consequence of the
definitions we conclude the following:


Proposition 7.1 If $V$ is a nontrivial vector space over a field $F$, then $\operatorname{End}(V)$
is an associative unital $F$-algebra with $\sigma_0$ being the identity element for
addition and $\sigma_1$ being the identity element for multiplication.


If $V$ is a nontrivial vector space over a field $F$ then there exists a function
$\sigma : F \to \operatorname{End}(V)$ defined by $\sigma : c \mapsto \sigma_c$ for all $c \in F$. This function is monic, for
if $\sigma_c = \sigma_d$ then for any $0_V \ne v \in V$ we have $cv = \sigma_c(v) = \sigma_d(v) = dv$ and hence
$(c - d)v = 0_V$. Since $0_V \ne v$, this implies that $c - d = 0$ and so $c = d$. Moreover,
if $c, d \in F$ then $\sigma_c + \sigma_d = \sigma_{c+d}$ and $\sigma_c\sigma_d = \sigma_{cd}$, so $\sigma$ is a monic homomorphism
of unital $F$-algebras. We can use this function to identify $F$ with its image under $\sigma$
and consider it a subalgebra of the $F$-algebra $\operatorname{End}(V)$.

If $\alpha, \beta \in \operatorname{End}(V)$ and if $c \in F$, then we have already seen that the functions $\alpha + \beta$,
$\alpha\beta$, and $c\alpha$ all belong to $\operatorname{End}(V)$. Therefore, we see that if $p(X) = \sum_{i=0}^n a_i X^i \in
F[X]$ then $p(\alpha) = \sum_{i=0}^n a_i \alpha^i$ is an endomorphism of $V$, and, indeed, the set $F[\alpha]$
of all endomorphisms of $V$ of this form is an $F$-subalgebra of $\operatorname{End}(V)$. The function
from $F[X]$ to $F[\alpha]$ given by $p(X) \mapsto p(\alpha)$ is immediately seen to be an epic
homomorphism of unital $F$-algebras for any $\alpha \in \operatorname{End}(V)$.


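Example Let $F = GF(2)$ and let $p(X) = X^2 + X \in F[X]$. Then $p(a) = 0$ for every
$a \in F$. However, $p(\alpha) \ne \sigma_0$, where $\alpha \in \operatorname{End}(F^2)$ is defined by $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ a \end{bmatrix}$.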
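
The distinction between evaluating $p(X)$ at scalars and at an endomorphism can be checked by direct computation. The following is a minimal sketch, assuming that $\alpha$ is the coordinate swap on $GF(2)^2$ and representing it by a $2 \times 2$ matrix with entries reduced modulo 2; the helper names `mat_mul`, `mat_add`, `p_scalar`, and `p_endo` are illustrative and do not come from the text.

```python
# Evaluate p(X) = X^2 + X at the scalars of GF(2) and at an endomorphism of GF(2)^2.

def mat_mul(a, b, p=2):
    """Product of two square matrices with entries reduced modulo p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def mat_add(a, b, p=2):
    """Sum of two matrices with entries reduced modulo p."""
    return [[(x + y) % p for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

def p_scalar(a):
    """p(a) = a^2 + a computed in GF(2)."""
    return (a * a + a) % 2

def p_endo(alpha):
    """p(alpha) = alpha^2 + alpha computed in End(GF(2)^2)."""
    return mat_add(mat_mul(alpha, alpha), alpha)

swap = [[0, 1], [1, 0]]          # assumed alpha: [a, b] -> [b, a]
scalar_values = [p_scalar(a) for a in (0, 1)]
p_of_swap = p_endo(swap)

print(scalar_values)             # p vanishes at every scalar
print(p_of_swap)                 # but p(alpha) is not the zero matrix
```

Under this assumption $p(\alpha) = \sigma_1 + \alpha$, which sends $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ to $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and so is not $\sigma_0$.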


Example Structures of the form $F[\alpha]$ are important in many areas of mathematics.
For example, let $V$ be the collection of all infinitely-differentiable functions from $\mathbb{R}$
to $\mathbb{R}$ and let $\delta$ be the differentiation endomorphism on $V$. If $p(X) = \sum_{i=0}^n a_i X^i \in
\mathbb{R}[X]$, then we have $p(\delta) : f \mapsto a_0 f + \sum_{i=1}^n a_i f^{[i]}$, where $f^{[i]}$ denotes the $i$th
derivative of $f$. Such an endomorphism is called a differential operator with constant
coefficients on $V$. If $c \in \mathbb{R}$ and if $f_c \in V$ is the function given by $f_c : x \mapsto e^{cx}$
then $\delta(f_c) = c f_c$ and so $p(\delta) : f_c \mapsto \sum_{i=0}^n a_i c^i f_c = p(c) f_c$.
Thus, $p(\delta)(f_c)$ is the 0-function whenever $c$ is a root of $p(X)$. Hence $f_c \in \ker(p(\delta))$
for each root $c$ of $p(X)$.


Example Let $V$ be the convolution algebra on $\mathbb{R}$ and let $h \in V$ be the constant
function $t \mapsto 1$. Then $h$ defines an endomorphism of $V$ given by $f \mapsto h * f$, called
the integration endomorphism since $h * f : t \mapsto \int_0^t f(u)\,du$.


Example Let $F$ be a field and let $(K, \bullet)$ be a nonassociative $F$-algebra. An
endomorphism $\delta \in \operatorname{End}(K)$ is a derivation if and only if $\delta(v \bullet w) = [\delta(v)] \bullet w + v \bullet
[\delta(w)]$. Thus, for example, if $K$ is a Lie algebra then, as a consequence of the Jacobi
identity, we see that every $y \in K$ defines a derivation $\delta_y$ of $K$ given by $\delta_y : v \mapsto y \bullet v$.
Also, if $K$ is the $\mathbb{R}$-algebra consisting of all infinitely-differentiable functions in $\mathbb{R}^{\mathbb{R}}$,
then the endomorphism of $K$ which assigns to each function in $K$ its derivative is
a derivation. The set of all derivations defined on $K$ is a subspace of $\operatorname{End}(K)$. If $\delta$
and $\delta'$ are derivations on $K$, then $\delta\delta'$ is not, in general, a derivation on $K$, but the
Lie product $\delta\delta' - \delta'\delta$ is always a derivation on $K$, and so the set of all derivations
on $K$ is a Lie algebra over $F$.


Given a nontrivial vector space $V$ over a field $F$, we note that the $F$-algebra
$\operatorname{End}(V)$ is neither necessarily commutative nor necessarily entire, as the following
examples show:


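Example Let $F$ be a field and let $V = F^3$. Let $\alpha, \beta \in \operatorname{End}(V)$ be the endomorphisms
defined by
$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} b \\ a \\ c \end{bmatrix} \quad\text{and}\quad \beta : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a \\ 0 \\ 0 \end{bmatrix}.$$
Then
$$\beta\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} b \\ 0 \\ 0 \end{bmatrix} \quad\text{and}\quad \alpha\beta : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ a \\ 0 \end{bmatrix},$$
so $\beta\alpha \ne \alpha\beta$.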


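Example Let $F$ be a field and let $V = F^3$. Let $\alpha, \beta \in \operatorname{End}(V)$ be the endomorphisms
defined by
$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ 0 \\ c \end{bmatrix} \quad\text{and}\quad \beta : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a \\ 0 \\ 0 \end{bmatrix}.$$
Then $\beta\alpha = \sigma_0 = \alpha\beta$.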
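
Both failures (of commutativity and of being entire) can be checked mechanically. The sketch below is illustrative only: it models vectors of $F^3$ as Python tuples of integers, which stand in for elements of an arbitrary field, and the names `compose`, `alpha1`, `beta1`, `alpha2`, and `beta2` are hypothetical labels for the maps of the two examples.

```python
# Endomorphisms of F^3 modelled as functions on 3-tuples.

def compose(f, g):
    """Return the composite fg, that is, v -> f(g(v))."""
    return lambda v: f(g(v))

def alpha1(v):                 # first example: [a, b, c] -> [b, a, c]
    a, b, c = v
    return (b, a, c)

def beta1(v):                  # first example: [a, b, c] -> [a, 0, 0]
    a, b, c = v
    return (a, 0, 0)

def alpha2(v):                 # second example: [a, b, c] -> [0, 0, c]
    a, b, c = v
    return (0, 0, c)

beta2 = beta1                  # second example uses the same beta

v = (1, 2, 3)
ba = compose(beta1, alpha1)(v)     # beta(alpha(v))
ab = compose(alpha1, beta1)(v)     # alpha(beta(v))

# A linear map is zero exactly when it kills a basis, so test the standard basis.
basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
zero = (0, 0, 0)
products_vanish = all(
    compose(beta2, alpha2)(e) == zero and compose(alpha2, beta2)(e) == zero
    for e in basis
)

print(ba, ab, products_vanish)
```

In the second example neither factor is $\sigma_0$, yet both products are, which is exactly what it means for $\operatorname{End}(V)$ to fail to be entire.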


We do, however, have the following: 


Proposition 7.2 Let $V$ be a vector space over a field $F$. Then for all
$\alpha \in \operatorname{End}(V)$ and all $c \in F$ we have $\alpha\sigma_c = \sigma_c\alpha$.


Proof If $v \in V$ then $\alpha\sigma_c(v) = \alpha(cv) = c\alpha(v) = \sigma_c\alpha(v)$.


An endomorphism of a vector space $V$ over a field $F$ which is also an isomorphism
(i.e., which is both monic and epic) is called an automorphism of $V$. Since
$\alpha(0_V) = 0_V$ for any endomorphism $\alpha$ of $V$, we see that any automorphism of $V$
induces a permutation of $V \setminus \{0_V\}$. Similarly, a homomorphism of $F$-algebras which
is also an isomorphism is an automorphism of $F$-algebras.

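By what we have already seen, we know that $\alpha \in \operatorname{End}(V)$ is an automorphism
if and only if there exists an endomorphism $\alpha^{-1} \in \operatorname{End}(V)$ satisfying $\alpha\alpha^{-1} =
\sigma_1 = \alpha^{-1}\alpha$. We will denote the set of all automorphisms of $V$ by $\operatorname{Aut}(V)$. This set
is nonempty, since $\sigma_1 \in \operatorname{Aut}(V)$, where $\sigma_1^{-1} = \sigma_1$. Moreover, if $\alpha, \beta \in \operatorname{Aut}(V)$ then
$(\alpha\beta)(\beta^{-1}\alpha^{-1}) = \alpha(\beta\beta^{-1})\alpha^{-1} = \alpha\alpha^{-1} = \sigma_1$ and similarly $(\beta^{-1}\alpha^{-1})(\alpha\beta) = \sigma_1$.
Thus $\alpha\beta \in \operatorname{Aut}(V)$, with $(\alpha\beta)^{-1} = \beta^{-1}\alpha^{-1}$. It is also clear that if $\alpha \in \operatorname{Aut}(V)$
then $\alpha^{-1} \in \operatorname{Aut}(V)$. If $\alpha \in \operatorname{Aut}(V)$ and $0 \ne c \in F$, then $c\alpha \in \operatorname{Aut}(V)$ and
$(c\alpha)^{-1} = c^{-1}\alpha^{-1}$.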


Example Let $V$ be a vector space over a field $F$ and let $n > 1$ be an integer. Any
permutation $\pi$ of the set $\{1, \ldots, n\}$ defines an automorphism $\alpha_\pi$ of $V^n$ given by
$$\alpha_\pi : \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \mapsto \begin{bmatrix} v_{\pi(1)} \\ v_{\pi(2)} \\ \vdots \\ v_{\pi(n)} \end{bmatrix},$$
which rearranges the entries of each vector according to
the permutation $\pi$. More generally, if $V$ is a vector space over a field $F$ having a
basis $B = \{v_i \mid i \in \Omega\}$ and if $\pi$ is a permutation of $\Omega$, then there is an automorphism
of $V$ defined by $\sum_{i \in \Lambda} a_i v_i \mapsto \sum_{i \in \Lambda} a_i v_{\pi(i)}$ for each finite subset $\Lambda$ of $\Omega$.


Example Let $F$ be a field and let $n$ be a positive integer. We have already seen
that the function $A \mapsto A^T$ is an automorphism of $M_{n \times n}(F)$, considered as a vector
space over $F$.


Example Let $V$ be a vector space having finite dimension $n$ over a field $F$ and
let $v$ and $y$ be nonzero elements of $V$. Then there exist bases $\{v_1, \ldots, v_n\}$ and
$\{y_1, \ldots, y_n\}$ of $V$ satisfying $v_1 = v$ and $y_1 = y$. The function $\alpha : V \to V$ defined
by $\alpha : \sum_{i=1}^n a_i v_i \mapsto \sum_{i=1}^n a_i y_i$ is thus an automorphism of $V$ satisfying $\alpha(v) = y$.


Let $V$ be a vector space over a field $F$ and let $n$ be a positive integer. We will list
several types of automorphisms, called elementary automorphisms, of a vector space
of the form $V^n$. These automorphisms will play an important part in our ensuing
discussion.
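(1) If $1 \le h \ne k \le n$, we define $\varepsilon_{hk} \in \operatorname{Aut}(V^n)$ by
$$\varepsilon_{hk} : \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \mapsto \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}, \quad\text{where } w_i = \begin{cases} v_k & \text{if } i = h, \\ v_h & \text{if } i = k, \\ v_i & \text{otherwise.} \end{cases}$$
This automorphism satisfies $\varepsilon_{hk}^{-1} = \varepsilon_{hk}$.
(2) If $1 \le h \le n$, and if $0 \ne c \in F$, we define $\varepsilon_{h;c} \in \operatorname{Aut}(V^n)$ by
$$\varepsilon_{h;c} : \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \mapsto \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}, \quad\text{where } w_i = \begin{cases} cv_i & \text{if } i = h, \\ v_i & \text{otherwise.} \end{cases}$$
This automorphism satisfies $\varepsilon_{h;c}^{-1} = \varepsilon_{h;c^{-1}}$.
(3) If $1 \le h \ne k \le n$ and if $c \in F$, we define $\varepsilon_{hk;c} \in \operatorname{Aut}(V^n)$ by
$$\varepsilon_{hk;c} : \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \mapsto \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}, \quad\text{where } w_i = \begin{cases} v_h + cv_k & \text{if } i = h, \\ v_i & \text{otherwise.} \end{cases}$$
This automorphism satisfies $\varepsilon_{hk;c}^{-1} = \varepsilon_{hk;-c}$.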
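
The three types of elementary automorphism, together with their stated inverses, can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: it takes $V = \mathbb{Q}$ (so that elements of $V^n$ are lists of `Fraction` values), uses 0-based positions instead of the 1-based indices of the text, and the function names `swap`, `scale`, and `add_multiple` are hypothetical.

```python
from fractions import Fraction

def swap(v, h, k):
    """Type (1): interchange the entries in positions h and k."""
    w = list(v)
    w[h], w[k] = w[k], w[h]
    return w

def scale(v, h, c):
    """Type (2): multiply the entry in position h by a nonzero scalar c."""
    if c == 0:
        raise ValueError("c must be nonzero for the map to be invertible")
    w = list(v)
    w[h] = c * w[h]
    return w

def add_multiple(v, h, k, c):
    """Type (3): replace the entry in position h by v[h] + c * v[k], with h != k."""
    w = list(v)
    w[h] = w[h] + c * w[k]
    return w

v = [Fraction(1), Fraction(2), Fraction(3)]

# Each map is undone by the elementary automorphism named in the text.
undo_swap = swap(swap(v, 0, 2), 0, 2)
undo_scale = scale(scale(v, 1, Fraction(5)), 1, Fraction(1, 5))
undo_add = add_multiple(add_multiple(v, 0, 1, Fraction(7)), 0, 1, Fraction(-7))

print(undo_swap == v, undo_scale == v, undo_add == v)
```

Exact rational arithmetic is used so that the inverse relations hold without rounding error; the same functions work for any entries supporting addition and scalar multiplication.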
Identifying the automorphisms of a finite-dimensional vector space $V$ over a field
$F$ is a problem which will be of major importance to us later, and so it is important
to characterize these functions.


Proposition 7.3 Let $V$ be a vector space of finite dimension $n$ over a field $F$.
Then the following conditions on an endomorphism $\alpha$ of $V$ are equivalent:
(1) $\alpha$ is an automorphism of $V$;
(2) $\alpha$ is monic;
(3) $\alpha$ is epic.


Proof By definition, (1) implies (2). Now assume (2). By Proposition 6.10, we see
that the rank of $\alpha$ equals $n$ and so $\operatorname{im}(\alpha) = V$ by Proposition 5.11, proving (3).
Now assume (3). By Proposition 6.10, we see that the nullity of $\alpha$ equals $n - n = 0$
and so $\ker(\alpha) = \{0_V\}$, proving that $\alpha$ is monic as well, and so is bijective. This
proves (1).


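Proposition 7.4 Let $V$ be a finite-dimensional vector space over a field $F$ and
let $\alpha \in \operatorname{End}(V)$. If there exists a $\beta \in \operatorname{End}(V)$ satisfying $\alpha\beta = \sigma_1$ or $\beta\alpha = \sigma_1$,
then $\alpha \in \operatorname{Aut}(V)$ and $\beta = \alpha^{-1}$.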
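

Proof If $\beta\alpha = \sigma_1$ then $\ker(\alpha) \subseteq \ker(\sigma_1) = \{0_V\}$ and so, by Proposition 7.3,
$\alpha \in \operatorname{Aut}(V)$. Similarly, if $\alpha\beta = \sigma_1$ then $\operatorname{im}(\alpha) \supseteq \operatorname{im}(\sigma_1) = V$ and so, by
Proposition 7.3, $\alpha \in \operatorname{Aut}(V)$. Moreover, if $\alpha\beta = \sigma_1$ we see that $\alpha^{-1} = \alpha^{-1}\sigma_1 = \alpha^{-1}(\alpha\beta) =
\beta$ and similarly $\alpha^{-1} = \beta$ when $\beta\alpha = \sigma_1$.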


Example Proposition 7.3 and Proposition 7.4 are no longer true if we remove the
condition of finite dimensionality. For example, let $F$ be a field and let $V = F[X]$.
Define the endomorphisms $\alpha$ and $\beta$ of $V$ by setting $\alpha : \sum_{i=0}^n a_i X^i \mapsto \sum_{i=0}^n a_i X^{i+1}$
and $\beta : \sum_{i=0}^n a_i X^i \mapsto \sum_{i=1}^n a_i X^{i-1}$. Then $\alpha, \beta \notin \operatorname{Aut}(V)$, despite the fact that $\alpha$ is
monic and $\beta$ is epic. Moreover, $\beta\alpha = \sigma_1$ but $\alpha\beta \ne \sigma_1$.
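
The one-sided inverse in this example is easy to see in code. As a minimal sketch, assume a polynomial $\sum a_i X^i$ is stored as the list of its coefficients `[a_0, a_1, ...]`, with the zero polynomial stored as the empty list; the function names `alpha` and `beta` simply mirror the symbols of the example.

```python
def alpha(p):
    """Multiply by X: shift every coefficient one degree up."""
    return [0] + list(p) if p else []

def beta(p):
    """Discard the constant term and shift every other coefficient one degree down."""
    return list(p[1:])

p = [5, 1, 2]                    # 5 + X + 2X^2

beta_alpha = beta(alpha(p))      # recovers p
alpha_beta = alpha(beta(p))      # loses the constant term

print(beta_alpha)
print(alpha_beta)
```

Here `alpha` never produces a polynomial with nonzero constant term, so it is not epic, while `beta` sends every constant polynomial to zero, so it is not monic.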


Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$. A subspace $W$ of $V$
is invariant under $\alpha$ if and only if $\alpha(w) \in W$ for all $w \in W$ or, in other words, if and
only if $\alpha(W) \subseteq W$. Thus, $W$ is invariant under $\alpha$ if and only if the restriction of $\alpha$
to $W$ is an endomorphism of $W$. It is clear that $V$ and $\{0_V\}$ are both invariant under
every endomorphism of $V$. If $\alpha \in \operatorname{End}(V)$ then $\operatorname{im}(\alpha)$ and $\ker(\alpha)$ are both invariant
under $\alpha$.


Example Let $F$ be a field and, for each positive integer $k$, let $W_k$ be the subspace
of $F[X]$ composed of all polynomials of degree at most $k$. Let $\delta$ be the formal
differentiation endomorphism of $F[X]$, namely the endomorphism defined by
$\delta : \sum_{i=0}^n a_i X^i \mapsto \sum_{i=1}^n i a_i X^{i-1}$. Then each of the subspaces $W_k$ is invariant
under $\delta$. Now assume that $F$ is of characteristic 0. If $p(X) = \sum_{i=0}^k a_i X^i \in W_k$ and if
$a \in F$ then it is easy to check that $p(X) = p(a) + \sum_{i=1}^k \frac{1}{i!}[\delta^i(p)(a)](X - a)^i$. The
coefficients $\frac{1}{i!}[\delta^i(p)(a)]$ are known as the Taylor coefficients of $p(X)$ around $a$.
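
The Taylor coefficients can be computed by repeated formal differentiation. The following sketch assumes $F = \mathbb{Q}$, stores $\sum a_i X^i$ as the coefficient list `[a_0, a_1, ...]`, and uses the hypothetical helper names `derivative`, `evaluate`, and `taylor_coefficients`; it includes the index $i = 0$, for which the coefficient is just $p(a)$.

```python
from fractions import Fraction
from math import factorial

def derivative(p):
    """Formal derivative: sum a_i X^i -> sum i * a_i X^(i-1)."""
    return [i * c for i, c in enumerate(p)][1:]

def evaluate(p, x):
    """Value p(x), computed by Horner's rule."""
    result = Fraction(0)
    for c in reversed(p):
        result = result * x + c
    return result

def taylor_coefficients(p, a):
    """Return the list of (1/i!) * [delta^i(p)(a)] for i = 0, ..., deg(p)."""
    coeffs = []
    q = list(p)
    for i in range(len(p)):
        coeffs.append(evaluate(q, a) / factorial(i))
        q = derivative(q)
    return coeffs

p = [1, 0, 2, 1]                          # 1 + 2X^2 + X^3
a = Fraction(1)
coeffs = taylor_coefficients(p, a)

# Re-expand around a and compare with p at a sample point.
x = Fraction(3)
reexpanded = sum(c * (x - a) ** i for i, c in enumerate(coeffs))

print(coeffs)
print(reexpanded == evaluate(p, x))
```

For this choice of $p$ and $a = 1$ the expansion is $4 + 7(X - 1) + 5(X - 1)^2 + (X - 1)^3$, which multiplies out to $1 + 2X^2 + X^3$.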


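Example Let $V = \mathbb{R}^2$ and let $\alpha$ be the automorphism of $V$ defined by
$\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ -a \end{bmatrix}$. Let $W$ be a proper subspace of $V$ which is invariant under $\alpha$.
Then $\dim(W) \le 1$ and so there exists a vector $w = \begin{bmatrix} c \\ d \end{bmatrix}$ satisfying $W = \mathbb{R}w$. Since
$\alpha(w) = \begin{bmatrix} d \\ -c \end{bmatrix} \in W$, it follows that there exists a real number $e$ such that $\alpha(w) = ew$.
That is to say, $ec = d$ and $ed = -c$. From this we learn that $ce^2 = -c$, and so
$c = d = 0$. This proves that $W = \left\{ \begin{bmatrix} 0 \\ 0 \end{bmatrix} \right\}$, and so we see that $V$ has no proper
nontrivial subspaces invariant under $\alpha$.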


Example Let $F$ be a field and let $n$ be a positive integer. Let $\alpha$ be the automorphism
of $F^n$ defined by
$$\alpha : \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} \mapsto \begin{bmatrix} a_n \\ a_1 \\ \vdots \\ a_{n-1} \end{bmatrix}.$$
A subspace $W$ invariant under $\alpha$ is cyclic.
Cyclic subspaces of $F^n$, where $F$ is a finite field, are important in defining certain
families of error-correcting codes.


Let $F$ be a field. An element $a$ of an $F$-algebra $(K, \bullet)$ is idempotent if and only
if $a^2 = a$. If $V$ is a vector space over $F$, then an idempotent element of $\operatorname{End}(V)$ is
called a projection. Note that if $\alpha \in \operatorname{End}(V)$ is a projection and if $w = \alpha(v) \in \operatorname{im}(\alpha)$
then $\alpha(w) = \alpha^2(v) = \alpha(v) = w$, so that the restriction of $\alpha$ to its image is just $\sigma_1$.
The converse is also true. If $\alpha \in \operatorname{End}(V)$ satisfies the condition that the restriction of
$\alpha$ to its image is just $\sigma_1$, then for each $v \in V$ we have $\alpha^2(v) = \alpha(\alpha(v)) = \sigma_1(\alpha(v)) =
\alpha(v)$ and so $\alpha$ is a projection.


Example If $F$ is a field then the endomorphism of $F^3$ defined by
$$\begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 3a - 2c \\ -a + b + c \\ 3a - 2c \end{bmatrix}$$
is a projection.


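Example The sum of two projections need not be a projection. For example, if
$V = \mathbb{R}^3$ then the endomorphisms $\alpha$ and $\beta$ of $V$ defined by
$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a \\ b \\ 0 \end{bmatrix} \quad\text{and}\quad \beta : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ b \\ c \end{bmatrix}$$
are projections, but $\alpha + \beta$ is not a projection.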
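
Idempotence is a finite check once an endomorphism of $F^3$ is written as a matrix with respect to the canonical basis. The sketch below does this for the two preceding examples, using integer entries as a stand-in for field elements; the matrices `P`, `A`, `B` and the helpers `mat_mul` and `is_projection` are illustrative names, not notation from the text.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_projection(m):
    """An endomorphism is a projection exactly when its matrix satisfies m * m == m."""
    return mat_mul(m, m) == m

# [a, b, c] -> [3a - 2c, -a + b + c, 3a - 2c]
P = [[3, 0, -2],
     [-1, 1, 1],
     [3, 0, -2]]

# [a, b, c] -> [a, b, 0] and [a, b, c] -> [0, b, c]
A = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]
S = [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

print(is_projection(P), is_projection(A), is_projection(B), is_projection(S))
```

The sum fails because it doubles the middle coordinate: applying it twice multiplies that coordinate by 4 and not by 2.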


Example If $W$ is a subspace of a vector space $V$ over a field $F$ having a complement
$Y$ in $V$, we know that every element $v \in V$ can be written in a unique way in the form
$w + y$, where $w \in W$ and $y \in Y$. The endomorphism of $V$ defined by $v \mapsto w$ is a
projection the image of which is $W$. Statisticians often consider data in $V = \mathbb{R}^n$ and
use a projection in $\operatorname{End}(V)$ to project it onto a subspace $W$ of $V$ that best preserves
the variance in the data. This standard method in data analysis is called principal
component analysis and there exist several efficient algorithms for performing it.


In fact, all projections of a vector space are of the form in the previous example,
as the following proposition shows.


Proposition 7.5 Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$
be a projection. Then $V = \operatorname{im}(\alpha) \oplus \ker(\alpha)$.


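Proof If $v \in \operatorname{im}(\alpha) \cap \ker(\alpha)$ then there exists an element $y$ of $V$ satisfying $v = \alpha(y)$,
so $\alpha(v) = \alpha^2(y) = \alpha(y) = v$, while $\alpha(v) = 0_V$ since $v \in \ker(\alpha)$. Hence
$v = \alpha(v) = 0_V$. Thus $\operatorname{im}(\alpha)$ and $\ker(\alpha)$ are disjoint. If $v$ is an arbitrary
vector in $V$ then $v = [v - \alpha(v)] + \alpha(v) \in \ker(\alpha) + \operatorname{im}(\alpha)$. Therefore, $V = \operatorname{im}(\alpha) \oplus
\ker(\alpha)$.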


Proposition 7.6 Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$.
A subspace $W$ of $V$ is invariant under $\alpha$ if and only if $\beta\alpha\beta = \alpha\beta$ for each
projection $\beta$ of $V$ the image of which is $W$.


Proof Assume that $W$ is invariant under $\alpha$ and let $\beta$ be a projection of $V$ the image
of which is $W$. By Proposition 7.5, we have $V = W \oplus \ker(\beta)$. If $v \in V$, we
can therefore write $v = w + y$, where $w \in W$ and $y \in \ker(\beta)$. Hence $\alpha\beta(v) =
\alpha\beta(w) + \alpha\beta(y) = \alpha(w) + 0_V = \alpha(w) = \beta\alpha(w) = \beta\alpha\beta(v)$, showing that $\beta\alpha\beta = \alpha\beta$.
Conversely, if $\beta\alpha\beta = \alpha\beta$ for each projection $\beta$ of $V$ the image of which is $W$
then, for each such $\beta$, we have $w = \beta(w)$ for all $w \in W$ and so $\alpha(w) = \alpha\beta(w) =
\beta\alpha\beta(w) \in W$, showing that $W$ is invariant under $\alpha$.


Proposition 7.7 Let $V$ be a vector space over a field $F$ and let $\{W_1, \ldots, W_n\}$
be a set of subspaces of $V$. Then the following conditions are equivalent:
(1) $V = W_1 \oplus \cdots \oplus W_n$;
(2) There exist projections $\alpha_1, \ldots, \alpha_n$ in $\operatorname{End}(V)$ with $W_i = \operatorname{im}(\alpha_i)$ for
all $1 \le i \le n$, which satisfy the conditions $\alpha_i\alpha_j = \sigma_0$ for $i \ne j$ and
$\alpha_1 + \cdots + \alpha_n = \sigma_1$.


Proof (1) $\Rightarrow$ (2): From (1) it follows that every $v \in V$ can be written in a unique
manner as $\sum_{i=1}^n w_i$, where $w_i \in W_i$ for all $1 \le i \le n$. Define $\alpha_i$ to be the projection
$v \mapsto w_i$ for each $i$. It is easy to verify that these linear transformations do indeed
satisfy the required conditions.

(2) $\Rightarrow$ (1): Since $\alpha_1 + \cdots + \alpha_n = \sigma_1$, we surely have $V = \sum_{i=1}^n \operatorname{im}(\alpha_i) =
\sum_{i=1}^n W_i$. If $0_V \ne v \in W_h \cap \sum_{i \ne h} W_i$ then there exists an $i \ne h$ such that $\alpha_i(v) \ne
0_V$. But $\alpha_h(v) = v$ so $\alpha_i\alpha_h \ne \sigma_0$, a contradiction. Therefore, $W_h \cap \sum_{i \ne h} W_i = \{0_V\}$
for each $1 \le h \le n$, proving (1).


Proposition 7.8 Any two complements of a subspace W of a vector space V 
over a field F are isomorphic. 


Proof Let $U$ and $Y$ be complements of $W$ in $V$. By Proposition 7.7, we know that
there exists a projection $\beta \in \operatorname{End}(V)$ the image of which is $U$ and the kernel of
which is $W$. Let $\alpha$ be the restriction of $\beta$ to $Y$. The linear transformation $\alpha$ is a
monomorphism since $\ker(\alpha) \subseteq \ker(\beta) \cap Y = W \cap Y = \{0_V\}$. Any vector $u \in U$
can be written as $w + y$, where $w \in W$ and $y \in Y$, and we have $\alpha(y) = \beta(y) =
\beta(w) + \beta(y) = \beta(w + y) = \beta(u) = u$. Thus we see that $\alpha$ is also epic and hence is
the desired isomorphism.


We now introduce a notion which is basic in all branches of mathematics. A
relation $\equiv$ defined on a given nonempty set $U$ is called an equivalence relation if and
only if the following conditions are satisfied:

(1) $u \equiv u$ for all $u \in U$;
(2) $u \equiv u'$ if and only if $u' \equiv u$;
(3) If $u \equiv u'$ and $u' \equiv u''$ then $u \equiv u''$.


Example Let $B$ be a nonempty subset of a set $A$ and define a relation $\equiv_B$ on $A$ by
setting $a \equiv_B a'$ if and only if $a = a'$ or both $a$ and $a'$ belong to $B$. Then $\equiv_B$ is an
equivalence relation on $A$. In particular, if $W$ is a subspace of a vector space $V$ then
the relation $\equiv_W$ defined on $V$ by setting $v \equiv_W v'$ if and only if $v - v' \in W$ is an
equivalence relation on $V$.


Example Let $V$ and $W$ be vector spaces over a field $F$ and let $\alpha \in \operatorname{Hom}(V, W)$.
Define a relation $\equiv$ on $V$ by setting $v \equiv v'$ if and only if $\alpha(v) = \alpha(v')$. This is easily
seen to be an equivalence relation.


Let $V$ be a vector space over a field $F$. A subset $G$ of $\operatorname{Aut}(V)$ is a group of
automorphisms if it is closed under taking products, contains $\sigma_1$, and satisfies the
condition that $\alpha^{-1} \in G$ whenever $\alpha \in G$. Clearly, $\operatorname{Aut}(V)$ itself is such a group.
The notion of a group of automorphisms is very important in linear algebra and its
applications, but here we will only touch on it.


Example Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{Aut}(V)$. Then
$\{\alpha^i \mid i \in \mathbb{Z}\}$ is surely a group of automorphisms.


Example Let $V$ be a vector space over a field $F$ and let $\Omega$ be a nonempty set. Every
permutation $\pi$ of $\Omega$ defines an automorphism $\alpha_\pi$ of the vector space $V^\Omega$ over $F$
defined by $\alpha_\pi(f) : i \mapsto f(\pi(i))$ for all $i \in \Omega$ and all $f \in V^\Omega$. The collection $G$ of
all such automorphisms is a group of automorphisms in $\operatorname{Aut}(V^\Omega)$.


Proposition 7.9 If $V$ is a vector space over a field $F$ and if $G$ is a group of
automorphisms of $V$ then $G$ defines an equivalence relation $\sim_G$ on $V$ by
setting $v \sim_G v'$ if and only if there exists an element $\alpha$ of $G$ satisfying $\alpha(v) = v'$.


Proof If $v \in V$ then $\sigma_1(v) = v$, and so $v \sim_G v$. If $v, v' \in V$ satisfy $v \sim_G v'$ then
there exists an element $\alpha$ of $G$ satisfying $\alpha(v) = v'$, and so $v = \alpha^{-1}(v')$. Thus
$v' \sim_G v$. Finally, if $v, v', v'' \in V$ satisfy $v \sim_G v'$ and $v' \sim_G v''$ then there exist
elements $\alpha$ and $\beta$ of $G$ satisfying $\alpha(v) = v'$ and $\beta(v') = v''$, and so $\beta\alpha(v) = v''$.
Thus $v \sim_G v''$.
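
The equivalence classes of $\sim_G$ partition $V$, which can be seen concretely in a small case. The sketch below is an illustration under assumptions not made in the text: it takes $V = GF(2)^3$, lets $G$ be the group $\{\alpha^i \mid i \in \mathbb{Z}\}$ generated by the cyclic shift $\alpha$, and computes the class of each vector by applying $\alpha$ repeatedly. The names `shift` and `orbit` are hypothetical.

```python
from itertools import product

def shift(v):
    """Cyclic shift automorphism: (a_1, ..., a_n) -> (a_n, a_1, ..., a_(n-1))."""
    return (v[-1],) + v[:-1]

def orbit(v):
    """Equivalence class of v under the group generated by the shift."""
    seen = []
    w = v
    while w not in seen:
        seen.append(w)
        w = shift(w)
    return frozenset(seen)

vectors = list(product((0, 1), repeat=3))       # all 8 vectors of GF(2)^3
orbits = {orbit(v) for v in vectors}

for o in sorted(orbits, key=lambda s: sorted(s)):
    print(sorted(o))
```

The eight vectors fall into four classes: the zero vector, the all-ones vector, the three vectors with a single 1, and the three vectors with two 1s.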


Proposition 7.10 If $V$ is a vector space over a field $F$ and if $G$ is a group of
automorphisms of $V$ then all elements of $G$ have the same rank.


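Proof If $\alpha \in G$ then, by Proposition 6.11, $\operatorname{rk}(\sigma_1) = \operatorname{rk}(\alpha\alpha^{-1}) \le \operatorname{rk}(\alpha) = \operatorname{rk}(\alpha\sigma_1)
\le \operatorname{rk}(\sigma_1)$ and so $\operatorname{rk}(\alpha) = \operatorname{rk}(\sigma_1)$.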


Exercises 


Exercise 316 
Let $V$ be a vector space over $GF(3)$. Find an endomorphism $\alpha$ of $V$ satisfying
$\alpha(v) + \alpha(v) = v$ for all $v \in V$.


Exercise 317 

Let V bea vector space finitely generated over a field F and let a, B, y € End(V). 
Find necessary and sufficient conditions for there to exist an endomorphism 6 of 
V satisfying ayB = Bea. 


Exercise 318 
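Let $F = GF(2)$ and let $n$ be a positive integer. Let $\alpha : F^n \to F^n$ be the function
defined by $\alpha : \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \mapsto \begin{bmatrix} a_1' \\ \vdots \\ a_n' \end{bmatrix}$, where $0' = 1$ and $1' = 0$. Is $\alpha$ an endomorphism
of $F^n$?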


Exercise 319 
Let $\alpha, \beta : \mathbb{Q}[X] \to \mathbb{Q}[X]$ be defined by $\alpha : p(X) \mapsto Xp(X)$ and $\beta : p(X) \mapsto
X^2 p(X)$. Show that $\alpha$, $\beta$, and $\alpha - \beta$ are all monic endomorphisms of $\mathbb{Q}[X]$.


Exercise 320 

Let $V$ be a finitely-generated vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$.
Show that $\alpha$ is not monic if and only if there exists an endomorphism $\beta \ne \sigma_0$ of
$V$ satisfying $\alpha\beta = \sigma_0$.


Exercise 321 
Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$. Show that $\ker(\alpha) =
\ker(\alpha^2)$ if and only if $\ker(\alpha)$ and $\operatorname{im}(\alpha)$ are disjoint.


Exercise 322 
Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$. Show that $\operatorname{im}(\alpha) =
\operatorname{im}(\alpha^2)$ if and only if $V = \ker(\alpha) + \operatorname{im}(\alpha)$.


Exercise 323

Let $V$ be a vector space over a field $F$ and let $K = F \times V \times \operatorname{End}(V)$, which is
again a vector space over $F$. Define an operation $\diamond$ on $K$ by setting $(a, v, \alpha) \diamond
(b, w, \beta) = (ab, aw + \beta(v), \beta\alpha)$. Is $(K, \diamond)$ an $F$-algebra? Is it associative? Is it
unital?


Exercise 324 
Let V be a vector space over a field F, and let Aff(V,V) be the set of all 
affine transformations from V to itself. Is Aff(V, V), on which we have defined 
the operations of addition and composition of functions, an associative unital 
F-algebra? 


Exercise 325 


Let w € Aut(R2) be defined by a: E 


b 
tal subalgebra of End(IR*). Show that it is proper by giving an example of an 
endomorphism of R? not in this subalgebra. 


aa eal Show that R{a, 01} is a uni- 


Exercise 326 

Let $V$ be the space of all real-valued functions on the interval $[-1, 1]$ which are
infinitely differentiable, and let $\delta$ be the endomorphism of $V$ which assigns to
each function $f$ its derivative. Find the kernel and image of $\delta$.


Exercise 327 

Let $\alpha : \mathbb{C} \to \mathbb{C}$ be the function defined by $\alpha : a + bi \mapsto -b + ai$. Is $\alpha$ an
endomorphism of $\mathbb{C}$ considered as a vector space over $\mathbb{R}$? Is it an endomorphism of
$\mathbb{C}$ considered as a vector space over itself?


Exercise 328 
Let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $\alpha \in \operatorname{End}(V)$.
Show that there exists an automorphism $\beta$ of $V$ satisfying $\alpha\beta\alpha = \alpha$.


Exercise 329 
Let V = M2,2(R), which is a vector space over R. Let a € V” be defined by 
a a a a : 
eel eS lau! taal . Is @ an endomorphism of V? 
a2, a22 laz1|_ |a22\ 
Exercise 330 
Consider $\mathbb{R}$ as a vector space over $\mathbb{Q}$ and let $\alpha$ be an endomorphism of this space
satisfying the condition that there exists an $a_0 \in \mathbb{R}$ such that $\alpha$ is continuous at $a_0$.
Show that $\alpha$ is continuous at every $a \in \mathbb{R}$.


Exercise 331 

Let $A$ be a nonempty set and let $V$ be the collection of all subsets of $A$,
considered as a vector space over $GF(2)$. For which subsets $C$ of $A$ is the function
$B \mapsto B \cup C$ an endomorphism of $V$?


Exercise 332

Let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $\{\alpha_{ij} \mid
1 \le i, j \le n\}$ be a collection of endomorphisms of $V$, not all of which are equal
to $\sigma_0$, satisfying the condition that
$$\alpha_{ij}\alpha_{kh} = \begin{cases} \alpha_{ih} & \text{if } j = k, \\ \sigma_0 & \text{otherwise.} \end{cases}$$
Show that there exists a basis $\{v_1, \ldots, v_n\}$ of $V$ such that
$$\alpha_{jk}(v_i) = \begin{cases} v_j & \text{if } i = k, \\ 0_V & \text{otherwise.} \end{cases}$$


Exercise 333 

Let $V$ be a vector space of finite dimension $n$ over a field $F$ and choose an
element $\alpha \in \operatorname{End}(V)$. Let $\varphi : \operatorname{End}(V) \to \operatorname{End}(V)$ be the function defined by
$\beta \mapsto \beta\alpha$. This is an endomorphism of $\operatorname{End}(V)$, considered as a vector space
over $F$. Show that a positive integer $n$ satisfies $\alpha^n = \sigma_0$ if and only if $\varphi^n$ is
the 0-function.


Exercise 334 

Let a be an endomorphism of R? satisfying the condition that wa? = op. Show 
that there exists a linear transformation f : R? > R and that there exists a vector 
y € R? satisfying w(v) = B(v)y for all v € R?. 


Exercise 335 

For each $0 \ne a \in \mathbb{R}$, let $\beta_a : \mathbb{C} \to \mathbb{C}$ be the function defined by $\beta_a : z \mapsto z + a\bar{z}$.
Show that $\beta_a$ is an endomorphism of $\mathbb{C}$ considered as a vector space over $\mathbb{R}$, and
describe its image and kernel.


Exercise 336 
Let V be a vector space finitely generated over Q and let a, 6 € End(V) satisfy 
3a3 + Ta? — 2B + 4a — 0; = 09. Show that wf = Ba. 


Exercise 337 

Let $F$ be a field of characteristic other than 2 and let $V$ be a vector space of finite
dimension $n$ over $F$. Let $\alpha$ be an endomorphism of $V$ satisfying the condition
that $\alpha^2 = \sigma_1$. Show that $\operatorname{rk}(\sigma_1 - \alpha) + \operatorname{rk}(\sigma_1 + \alpha) = n$.


Exercise 338 

Let $V$ be a vector space over a field $F$ which is not finitely generated, and let
$\sigma_0 \ne \alpha \in \operatorname{End}(V)$. Set $A = \{\beta \in \operatorname{End}(V) \mid \alpha\beta = \sigma_1\}$. Show that if $A$ has more
than one element then it is infinite.


Exercise 339 
Let $V$ be a vector space over a field $F$ having dimension greater than 1. Show
that there exists a function $\alpha \in V^V$ which is not an endomorphism of $V$ but
which nonetheless satisfies the condition that $\alpha(av) = a\alpha(v)$ for all $a \in F$ and
all $v \in V$.


Exercise 340 
Let $V$ be a vector space over a field $F$ satisfying the condition that $\alpha\beta = \beta\alpha$ for
all $\alpha, \beta \in \operatorname{End}(V)$. Show that $\dim(V) = 1$.


Exercise 341 

Let V = M?2x2(R), considered as a vector space over R. Let a: V — V be the 
“|| a+2b+c+2d a | 

d 3a+6b+2c+5d a+2b+c+2d | 

Is @ an endomorphism of V ? Is it an automorphism of V? 


function defined by a : 


Exercise 342 

Let V be the vector space of all continuous functions from R to itself and let 
a:V— V be the function defined by a: f(x) hb [x? + sin(x) + 2) f (x). Show 
that w is an automorphism of V. 


Exercise 343 
Let $F$ be a field and let $\alpha : F[X] \to F[X]$ be the function defined by $\alpha : p(X) \mapsto
p(X + 1)$. Is $\alpha$ an endomorphism of $F[X]$? Is it an automorphism?


Exercise 344 

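Let $F$ be a field and, for each $a \in F$, let $\theta_a$ be the endomorphism of $F[X]$ defined
by $\theta_a : p(X) \mapsto p(X + a)$. Let $\alpha \in \operatorname{End}(F[X])$ satisfy $\alpha(X) \in F$ and $\alpha\theta_a = \theta_a\alpha$
for all $a \in F$. Can $\alpha$ be a monomorphism?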


Exercise 345 


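Let $\alpha \in \operatorname{End}(\mathbb{R}^3)$ be given by $\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a - 2b \\ c \\ a - b \end{bmatrix}$. Is $\alpha$ an automorphism
of $\mathbb{R}^3$?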


Exercise 346 
Let a be the endomorphism of R‘©) defined by 


a :[a}, a2, a3,...] > [b1, bz, b3,...], 


where by, = Le ie for each h > 1. Show that @ is an automor- 
phism satisfying wa =a7!. 

Exercise 347 

Let V be a vector space finitely generated over R and let a be an endomorphism 
of V satisfying a> + 4a? + 2a + 0; = 00. Show that a € Aut(V). 


Exercise 348

Let V be a vector space over a field F and let a, 6 € End(V) satisfy wB = 01. Set 
gy = 0; — Ba. Show that for every integer n > 1 we have o; = ae. Bk pak + 
Ba" ; 


Exercise 349 

Let V be the space of all polynomial functions from the interval [0, 1] on the 
real line to R. Let w and £ be the endomorphisms of V defined by a(f) : x 
i f@)dt and B(f): xh le f(@) dt. Find im(@ + 8). Is it true that wB = Ba? 


Exercise 350 

Let F be a field and let V = F®. Let n > 1 be an integer. Each vector 
d, 

y= | : | € F” defines an endomorphism 6, of V by 6y : [a1,a2,...] 
dy 

[b1, bz, ...], where by = 7) an—14id;, for h = 1,2,.... Show that if 6, is a 

monomorphism then the polynomial p(X) = )~/_, d;X i~l € F[X] has no roots 

in F. 


Exercise 351 
Let F = GF(5) and let V = F?. How many endomorphisms a of V satisfy the 


1 2 0 1 
conditions a 0 =]1] anda 3 =/]11]? 
0) 0 0 1 


Exercise 352 

Let V be the set of all continuous functions from R to itself, which is a vector 
space over R. Let a: V > V be the function defined by a(f) : xt» f (5) for all 
x € Randall f € V. Is a an automorphism of V? 


Exercise 353 

Let V = R®™ and let W be the subspace of V consisting of all convergent se- 
quences. Let a € End(V) be defined by a@: [a1, a2,...] > [bi, b2,...], where 
by, = Lye? a;) forallh > 1. If v € V satisfies a(v) € W, is v itself necessarily 
inW? 


Exercise 354 

Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{Aut}(V)$. Let $W_1, \ldots, W_k$ be
subspaces of $V$ satisfying $V = \bigoplus_{i=1}^k W_i$. For each $1 \le i \le k$, let $Y_i = \{\alpha(w) \mid
w \in W_i\}$. Is $V = \bigoplus_{i=1}^k Y_i$?


Exercise 355 
Consider $\mathbb{R}$ as a vector space over $\mathbb{Q}$. An endomorphism $\alpha$ of this space is
bounded if and only if there exists a nonnegative real number $m(\alpha)$ satisfying
the condition that $|\alpha(x)| \le m(\alpha)|x|$ for all $x \in \mathbb{R}$. Does the set of all bounded
endomorphisms of $\mathbb{R}$ form an $\mathbb{R}$-subalgebra of $\operatorname{End}(\mathbb{R})$?


Exercise 356 

Let $F$ be a field, let $n$ be a positive integer, and let $V = M_{n \times n}(F)$. Given a
matrix $B \in V$, is the function $\alpha_B : V \to V$ defined by $\alpha_B : A \mapsto AB + BA$ an
endomorphism of $V$?


Exercise 357 

Let $F$ be a field and let $V = F[X]$. Let $\delta \in \operatorname{End}(V)$ be the formal differentiation
function and let $\alpha \in \operatorname{End}(V)$ be defined by $\alpha : p(X) \mapsto Xp(X)$. Show that $\delta\alpha -
\alpha\delta = \sigma_1$.


Exercise 358 
Let V be a nontrivial vector space over a field F’. Is the set of all automorphisms 
of V a subspace of the vector space End(V) over F'? 


Exercise 359 
Consider GF(3) as a vector space over itself. Does there exist an automorphism 
of this space other than 01? 


Exercise 360 

Let V = F™ for some field F. Each w = [c}, c2,...] € V defines a function By : 
V— V by By : [a, a2,...] [a1,aic1 + ay, (ajc) + a2)c2 + 43,.. Ae Show 
that 6, is an automorphism of V. 


Exercise 361 

Let V bea vector space over a field F'; leta~ € End(V) and let 6 € Aut(V). Define 
.y2 2 ; | U B(v) 

the function 0: V“ > V* by setting 6: | a Fe real Is 6 necessarily 

an automorphism of V7? 


Exercise 362 


= 2 .| a 2b 
Let F = GF(5) and let a € Aut(F~) be defined by a : H med P ei a Show 


that there exists a positive integer / satisfying a/+! = @ and find the smallest 
such integer h. 


Exercise 363 

Let $F$ be a field of characteristic other than 2. Let $V$ be a vector space over $F$
and let $\alpha, \beta, \gamma, \delta$ be endomorphisms of $V$ satisfying the condition that $\alpha - \beta$ and
$\alpha + \beta$ are automorphisms of $V$. Show that there exist endomorphisms $\varphi$ and $\psi$
of $V$ satisfying $\varphi\alpha + \psi\beta = \gamma$ and $\psi\alpha + \varphi\beta = \delta$.


Exercise 364

Let $V$ be a vector space of finite dimension $n$ over a field $F$. Let $\alpha \in \operatorname{End}(V)$
and assume that there exists a vector $y \in V$ satisfying the condition that
$D = \{\alpha(y), \alpha^2(y), \ldots, \alpha^n(y)\}$ is a basis for $V$. Show that $D' = \{y, \alpha(y), \ldots, \alpha^{n-1}(y)\}$
is also a basis for $V$ and that $\alpha \in \operatorname{Aut}(V)$.


Exercise 365 
Let F be a field and let V = F™. Let a be the endomorphism of V defined by 
a(f):ith f@+1) forall f € V. Show that a—co, ¢ Aut(V) forallO Ace F. 


Exercise 366 

Let V be a vector space of finite dimension 7 over a field F , and letO<k<n 
be a positive integer. Let Ax be the set of all subspaces of V having dimension k. 
Let a € Aut(V) and, for each W € Ax, let 0,(W) = {a(w) | w € W}. Show that 
the function @, is a permutation of Ax. 


Exercise 367 
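Let $r$, $s$, and $t$ be distinct real numbers and let $\alpha$ be the endomorphism of $\mathbb{R}^3$
defined by $\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a + br + cr^2 \\ a + bs + cs^2 \\ a + bt + ct^2 \end{bmatrix}$. Is $\alpha$ an automorphism of $\mathbb{R}^3$?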


Exercise 368 
Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$. Show that
$W = \bigcup_{i=1}^{\infty} \ker(\alpha^i)$ is a subspace of $V$ which is invariant under $\alpha$.


Exercise 369 
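Let $\alpha$ and $\beta$ be the endomorphisms of $\mathbb{Q}^4$ defined by

$$\alpha : \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} \mapsto \begin{bmatrix} 2a - 2b - 2c - 2d \\ 5b - c - d \\ -b + 5c - d \\ -b - c + 5d \end{bmatrix} \quad \text{and} \quad \beta : \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ -b + 2c + 3d \\ 2b - 3c + 6d \\ 3b + 6c + 2d \end{bmatrix}.$$

Find two nontrivial proper subspaces of $\mathbb{Q}^4$ which are invariant both under $\alpha$ and
under $\beta$.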


Exercise 370 
Let F be a field and let V = F*. Let a be the endomorphism of V defined by 


a a—b 
b a—b , ‘ ' 

a ir a . Does there exist a two-dimensional subspace of V 
d c—b-d 


invariant under a? 
Exercise 371

Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$. If $W$ and $Y$ are
subspaces of $V$ which are invariant under $\alpha$, show that both $W + Y$ and $W \cap Y$
are invariant under $\alpha$.


Exercise 372 
Let W be a subspace of a vector space V over a field F and let S be the set of all 
a € End(V) such that W is invariant under qa. Is S necessarily an F'-subalgebra 
of End(V)? 


Exercise 373 

Let V be a vector space over a field F and let a € End(V). If W is a subspace of 
V, show that the set of all subspaces of W which are invariant under a, partially 
ordered by inclusion, has a maximal element. 


Exercise 374 

Let $V = \mathbb{R}^{\infty}$ and let $W$ be the subspace of $V$ consisting of all sequences
$[a_1, a_2, \ldots]$ for which the series $\sum_{i=1}^{\infty} a_i$ converges. Let $\sigma$ be a permutation of the
set of all positive integers and let $\alpha \in \operatorname{End}(V)$ be defined by $\alpha : [a_1, a_2, \ldots] \mapsto
[a_{\sigma(1)}, a_{\sigma(2)}, \ldots]$. Is $W$ invariant under $\alpha$?


Exercise 375 

Let $V$ be a vector space over a field $F$. Let $0 \ne c \in F$ and let $\alpha \in \operatorname{End}(V)$. Let
$\{x_0, x_1, \ldots, x_n\}$ be a set of vectors in $V$ satisfying $\alpha(x_0) = cx_0$ and $\alpha(x_i) - cx_i = x_{i-1}$
for all $1 \le i \le n$. Show that $F\{x_0, x_1, \ldots, x_n\}$ is a subspace of $V$ which is
invariant under $\alpha$.


Exercise 376 

Let $F$ be a field which is not finite and let $V$ be a vector space over $F$ having
dimension greater than 1. For each $0 \ne c \in F$, show that there exist infinitely
many distinct subspaces of $V$ which are invariant under the endomorphism $\sigma_c$
of $V$.


Exercise 377 

Let V be a vector space of finite dimension n over a field F. Let a € End(V) 
and let 6 € End(V) satisfy p =a. Find a positive integer k such that rk(6) < 
zitk(a) +n]. 


Exercise 378 

Let $\alpha$ and $\beta$ be endomorphisms of a vector space $V$ over a field $F$ and let
$\theta \in \operatorname{Aut}(V)$ satisfy $\theta\alpha = \beta\theta$. Show that a subspace $W$ of $V$ is invariant under
$\alpha$ if and only if $W' = \{\theta(w) \mid w \in W\}$ is invariant under $\beta$.


Exercise 379 
Let $\alpha$ and $\beta$ be endomorphisms of a vector space $V$ over a field $F$ satisfying
$\alpha\beta = \beta\alpha$. Is $\ker(\alpha)$ invariant under $\beta$?


Exercise 380
Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$ be a projection. Show
that $\sigma_1 - \alpha$ is also a projection.


Exercise 381 
Let V be a vector space finitely generated over a field F and let a € End(V) 
satisfy the condition a*(o; — w) = 09. Is w necessarily a projection? 


Exercise 382 

Let V be the space of all continuous functions from R to itself and let W = 
R{sin(x), cos(x)} C V. Let 6 be the endomorphism of W which assigns to each 
function its derivative. Find a polynomial p(X) € R[X] of degree 2 satisfying 


p(d) = 00. 


Exercise 383 
Let V bea vector space finitely generated over Q and assume that there exists an 
a € Aut(V) satisfying a~! = a? + a. Show that dim(V) is divisible by 3. 


Exercise 384 
a\ 


Let n be a positive integer and let G = : | €R") a; >Oforalll<i<n 


an 
Let a be an endomorphism of R” satisfying the condition that a(v) € G implies 
that v € G. Show that a € Aut(R”). 


Exercise 385 

Let V be a vector space over a field F and let W and Y be subspaces of V 
satisfying W + Y = V. Let Y’ be a complement of Y in V and let Y” be a 
complement of WM Y in W. Show that Y’= Y”. 


Exercise 386 

Let F be a field of characteristic other than 2 and let V be a vector space over F’. 
Let a, B € End(V) be projections satisfying the condition that a + 6 is also a 
projection. Show that a8 = Ba = oo. 


Exercise 387 
Let V be a vector space over F and let a, 6B € End(V). Show that w and 6 are 
projections satisfying ker(@) = ker() if and only if a8 =a@ and Ba = B. 


Exercise 388 
Let V be a vector space finitely generated over a field F and let a 4 oj be an 
endomorphism of V which is a product of projections. Show that a ¢ Aut(V). 


Exercise 389 
Let $V$ be a vector space over $\mathbb{Q}$ and let $\alpha \in \operatorname{End}(V)$. Show that $\alpha$ is a projection
if and only if $(2\alpha - \sigma_1)^2 = \sigma_1$.


Exercise 390
Let $\alpha$ and $\beta$ be endomorphisms of a vector space $V$ over a field $F$ and let
$f(X) \in F[X]$ satisfy $f(\alpha\beta) = \sigma_0$. Set $g(X) = Xf(X)$. Show that $g(\beta\alpha) = \sigma_0$.


Exercise 391 
Let V = R® and let g € V. Find necessary and sufficient conditions on g for the 
endomorphism f +> gf of V to be a projection. 


Exercise 392 

Let W be a subspace of a vector space V over a field F which is invariant un- 
der an endomorphism a of V. Let 6 € End(V) be a projection satisfying the 
condition that im(8) = W. Show that Ba = af. 


Exercise 393 

Let V be a vector space of finite dimension n over a field F and let aw € End(V). 
Show that there exists an automorphism 6 of V anda projection 6 of V satisfying 
a= pe. 


Exercise 394 

Let F be a field of characteristic other than 2 and let V be a vector space over F’. 
Let a € End(V) be a projection satisfying the condition that a — f is a projection 
for all 6 € End(V). Show that a = 01. 


Exercise 395 
Let V be a vector space over F and let a, 6 € End(V) be projections satisfying 
the condition that im(@) and im() are disjoint. Is it necessarily true that a6 = 


Ba? 


Exercise 396 

Let V bea vector space of finite dimension n over a field F and let S = End(V)\ 
Aut(V). For a, 6 € S, show that im(a~) = im(f) if and only if {a0 | 0 € S} = 
{Be |g € S}. 


Exercise 397 

Let F be a field. Does there exist an endomorphism a of F? which is not a 
projection satisfying the condition that a7 is a projection equal neither to og nor 
to oj. 


Exercise 398 

Let $V$ be a vector space finite dimensional over a field $F$ and let $\alpha$ be an
endomorphism of $V$. Show that there exists a positive integer $k$ such that $\operatorname{im}(\alpha^k)$ and
$\ker(\alpha^k)$ are disjoint.


Exercise 399

Let $F$ be a field of characteristic other than 2, and let $V$ be a vector space over $F$.
Let $\alpha \in \operatorname{End}(V)$ satisfy $\alpha^3 = \alpha$. Show that $V = W_1 \oplus W_2 \oplus W_3$, where
$W_1 = \{v \in V \mid \alpha(v) = v\}$, $W_2 = \{v \in V \mid \alpha(v) = -v\}$, and $W_3 = \ker(\alpha)$.


Exercise 400 

Let F be a field of characteristic other than 2 and let V be a finitely-generated 
vector space over F’. Show that every endomorphism of V is the sum of two 
automorphisms of V. 


Exercise 401 

Let n > 1 be an integer and let 6 : R” — R be the function defined by 
ay 

@:| : |e Vey ae Assume that we can define an operation e on R” sat- 


an 
isfying the condition that (R”, e) is an associative unital R-algebra with multi- 
plicative identity e, and also satisfying the condition that 6(v e w) = 0(v)0(w) 
for all v, w € R”. Show that (R”, +, e) is a division algebra over R. 


Exercise 402 

Any sequence v = [dj,d2,...] € R® defines an endomorphism a, of R[X] 
which acts on elements of the canonical basis of R[X] according to the rule 
Oy 2X" > Vg (7) KY a41X"* for each nonnegative integer n. Given a € R, 
find v, w € R™ such that ay : p(X) p(X +a) and ay: p(X) p(X +a) — 
p(a). 


Exercise 403 

Let V be a vector space over a field F and let G be a group of automorphisms 
of V. For uv € V, define the stabilizer of v in G to be Gy = {a € G| a(v) = v}. 
Is this necessarily a group of automorphisms of V? 


8 Representation of Linear Transformations by Matrices


In this chapter, we show how we can study linear transformations between
finitely-generated vector spaces by studying matrices. Let $V$ and $W$ be finitely-generated
vector spaces over a field $F$, where $\dim(V) = n$ and $\dim(W) = k$. Fix bases
$B = \{v_1, \ldots, v_n\}$ of $V$ and $D = \{w_1, \ldots, w_k\}$ of $W$. From Proposition 5.4, we know
that if we are given a linear transformation $\alpha \in \operatorname{Hom}(V, W)$ then for each $1 \le j \le n$
there exist scalars $a_{1j}, \ldots, a_{kj}$ satisfying the condition $\alpha(v_j) = \sum_{i=1}^{k} a_{ij} w_i$, and
that these scalars are in fact uniquely determined by $\alpha$. Thus $\alpha$ defines a matrix
$[a_{ij}] \in M_{k \times n}(F)$. Conversely, assume we have a matrix $A = [a_{ij}] \in M_{k \times n}(F)$.
Then we know that every vector $v$ in $V$ can be written in a unique way in the form
$\sum_{j=1}^{n} b_j v_j$, and so $A$ defines a linear transformation $\alpha \in \operatorname{Hom}(V, W)$ by setting

$$\alpha : v \mapsto \sum_{i=1}^{k} \Bigl( \sum_{j=1}^{n} a_{ij} b_j \Bigr) w_i.$$

Moreover, it is clear that different linear transformations
in $\operatorname{Hom}(V, W)$ define different matrices in $M_{k \times n}(F)$ and different matrices
in $M_{k \times n}(F)$ define different linear transformations in $\operatorname{Hom}(V, W)$. We summarize
the above remarks in the following proposition.
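The passage from a linear transformation to its matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: vectors are tuples of rational numbers, the helper names `solve` and `matrix_of` are invented here, and the coordinates of $\alpha(v_j)$ with respect to $D$ are found by a plain Gauss-Jordan elimination that assumes $D$ really is a basis.

```python
from fractions import Fraction

def solve(columns, target):
    """Coefficients c with sum(c[i] * columns[i]) == target (columns assumed to form a basis)."""
    n = len(columns)
    # Augmented matrix: the basis vectors are the columns, the target is the last column.
    rows = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(target[i])]
            for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def matrix_of(alpha, basis_v, basis_w):
    """k x n matrix whose j-th column holds the coordinates of alpha(v_j) in basis_w."""
    cols = [solve(basis_w, alpha(v)) for v in basis_v]
    return [[cols[j][i] for j in range(len(basis_v))] for i in range(len(basis_w))]

# Hypothetical example: alpha maps (a, b, c) to (a + b, b + c).
alpha = lambda v: (v[0] + v[1], v[1] + v[2])
B = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]   # canonical basis of Q^3
D = [(1, 1), (1, 0)]                     # a non-canonical basis of Q^2
M = matrix_of(alpha, B, D)
```

Changing `D`, or the order of its elements, changes `M` even though `alpha` is unchanged.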


With kind permission of the Special collections, Fine Arts Library, Harvard Univer- 
sity. 

The theory of matrices and their relation to linear transformations was 
developed in detail by the nineteenth-century British mathematician 
Sir Arthur Cayley, one of the most prolific researchers in history. 


Proposition 8.1 Let $V$ be a vector space of finite dimension $n$ over a field
$F$ and let $W$ be a vector space of finite dimension $k$ over $F$. For every
basis $B$ of $V$ and every basis $D$ of $W$ there exists a bijective function
$\Phi_{BD} : \operatorname{Hom}(V, W) \to M_{k \times n}(F)$, which is an isomorphism of vector spaces
over $F$.


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 133 
DOI 10.1007/978-94-007-2636-9_8, © Springer Science+Business Media B.V. 2012 
Proof We have already seen that if $B = \{v_1, \ldots, v_n\}$ and $D = \{w_1, \ldots, w_k\}$, then
the function $\Phi_{BD}$ is defined by $\Phi_{BD}(\alpha) = [a_{ij}]$, where $\alpha(v_j) = \sum_{i=1}^{k} a_{ij} w_i$ for all
$1 \le j \le n$, and that this function is bijective. We are therefore left to show that this
is a linear transformation. Indeed, if $\Phi_{BD}(\alpha) = [a_{ij}]$ and $\Phi_{BD}(\beta) = [b_{ij}]$ then

$$(\alpha + \beta)(v_j) = \sum_{i=1}^{k} (a_{ij} + b_{ij}) w_i = \sum_{i=1}^{k} a_{ij} w_i + \sum_{i=1}^{k} b_{ij} w_i = \alpha(v_j) + \beta(v_j)$$

for all $1 \le j \le n$, and so $\Phi_{BD}(\alpha + \beta) = \Phi_{BD}(\alpha) + \Phi_{BD}(\beta)$. Similarly, if $c \in F$
then $(c\alpha)(v_j) = \sum_{i=1}^{k} c a_{ij} w_i = c \bigl( \sum_{i=1}^{k} a_{ij} w_i \bigr) = c(\alpha(v_j))$ for all $1 \le j \le n$, and
so $\Phi_{BD}(c\alpha) = c\Phi_{BD}(\alpha)$. Thus we see that $\Phi_{BD}$ is indeed a linear transformation
and thus also an isomorphism.


We have already seen that, in the above situation, $\dim(M_{k \times n}(F)) = kn$ and so,
by Proposition 6.9, we also see that $\dim(\operatorname{Hom}(V, W)) = kn$.


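Example Let $V = \mathbb{R}^3$ and let $B$ be the canonical basis on $V$. Each vector
$v = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \in V$ defines a linear transformation $\alpha_v : V \to V$ given by $\alpha_v : w \mapsto v \times w$.
Then $\Phi_{BB}(\alpha_v) = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}$.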
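As a quick sanity check of this example, the following Python sketch compares the cross product $v \times w$ with the product of the skew-symmetric matrix and $w$. The function names are chosen here for illustration only.

```python
def cross(v, w):
    """Cross product of two vectors in R^3."""
    return [v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0]]

def cross_matrix(v):
    """Matrix representing w -> v x w with respect to the canonical basis."""
    a1, a2, a3 = v
    return [[0, -a3, a2],
            [a3, 0, -a1],
            [-a2, a1, 0]]

def mat_vec(A, w):
    """Product of a matrix (list of rows) with a column vector."""
    return [sum(a * b for a, b in zip(row, w)) for row in A]

v, w = [1, 2, 3], [4, 5, 6]
direct = cross(v, w)
via_matrix = mat_vec(cross_matrix(v), w)
```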


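Example Let $V = \mathbb{R}^3$ and $W = \mathbb{R}^2$, and choose the bases

$$B = \left\{ \begin{bmatrix} 0.5 \\ -0.5 \\ 0 \end{bmatrix}, \begin{bmatrix} 0.5 \\ 0 \\ -0.5 \end{bmatrix}, \begin{bmatrix} 0 \\ 0.5 \\ 0.5 \end{bmatrix} \right\}$$

of $V$ and $D = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right\}$ of $W$. If $\begin{bmatrix} r \\ s \\ t \end{bmatrix} \in \mathbb{R}^3$ then there exist $b_1, b_2, b_3 \in \mathbb{R}$
satisfying

$$\begin{bmatrix} r \\ s \\ t \end{bmatrix} = b_1 \begin{bmatrix} 0.5 \\ -0.5 \\ 0 \end{bmatrix} + b_2 \begin{bmatrix} 0.5 \\ 0 \\ -0.5 \end{bmatrix} + b_3 \begin{bmatrix} 0 \\ 0.5 \\ 0.5 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} b_1 + b_2 \\ -b_1 + b_3 \\ -b_2 + b_3 \end{bmatrix},$$

and so we have $2r = b_1 + b_2$, $2s = -b_1 + b_3$, and $2t = -b_2 + b_3$. From this we get
$b_1 = r - s + t$, $b_2 = r + s - t$, and $b_3 = r + s + t$.
The matrix $A = \begin{bmatrix} 3 & 5 & 7 \\ 4 & 8 & 2 \end{bmatrix}$ defines a linear transformation $\alpha \in \operatorname{Hom}(V, W)$
given by

$$\alpha : b_1 \begin{bmatrix} 0.5 \\ -0.5 \\ 0 \end{bmatrix} + b_2 \begin{bmatrix} 0.5 \\ 0 \\ -0.5 \end{bmatrix} + b_3 \begin{bmatrix} 0 \\ 0.5 \\ 0.5 \end{bmatrix} \mapsto (3b_1 + 5b_2 + 7b_3) \begin{bmatrix} 1 \\ 1 \end{bmatrix} + (4b_1 + 8b_2 + 2b_3) \begin{bmatrix} 1 \\ 0 \end{bmatrix},$$

so

$$\alpha \begin{bmatrix} r \\ s \\ t \end{bmatrix} = (15r + 9s + 5t) \begin{bmatrix} 1 \\ 1 \end{bmatrix} + (14r + 6s - 2t) \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 29r + 15s + 3t \\ 15r + 9s + 5t \end{bmatrix}.$$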
It is very important to emphasize that the matrix representation of a linear trans- 
formation depends on the bases which we fixed at the beginning, and on the order 
in which the elements of the bases are written! If we choose different bases or write 
the elements of a chosen basis in a different order, we will get a different matrix. 
Shortly, we will consider the relation between the matrices which represent a given 
linear transformation with respect to different bases. 

Let $V$ be a vector space finitely generated over a field $F$, let $\alpha$ be an
endomorphism of $V$, and let $W$ be a subspace of $V$ which is invariant under $\alpha$. As we
have already seen, the restriction $\beta$ of $\alpha$ to $W$ is an endomorphism of $W$. Now, let
$B = \{v_1, \ldots, v_k\}$ be a basis for $W$, which we can expand to a basis $D = \{v_1, \ldots, v_n\}$
for all of $V$. If $\Phi_{DD}(\alpha) = [a_{ij}]$ then for all $1 \le j \le k$ we have $\alpha(v_j) = \sum_{i=1}^{k} a_{ij} v_i$,
and so $a_{ij} = 0$ whenever $1 \le j \le k$ and $k < i \le n$. Thus we see that the matrix
$\Phi_{DD}(\alpha)$ is of the form $\begin{bmatrix} A_{11} & A_{12} \\ O & A_{22} \end{bmatrix}$, where $A_{11} = \Phi_{BB}(\beta)$. The subspace
$Y = F\{v_{k+1}, \ldots, v_n\}$ of $V$ is a complement of $W$ in $V$. If it too is invariant under $\alpha$
then we would also have $A_{12} = O$, and so $\alpha$ is represented by a matrix composed
of two square matrices “strung out” along the diagonal. From a computational point
of view, such a representation has distinct advantages.

Besides addition and scalar multiplication of matrices, we can also define the
product of two matrices, provided that these matrices are of suitable sizes. Let $(K, \bullet)$
be an associative unital algebra over a field $F$. If $A = [v_{ij}] \in M_{k \times n}(K)$ and
$B = [w_{jh}] \in M_{n \times t}(K)$ for some positive integers $k$, $n$, and $t$, we define the matrix $AB$ to
be the matrix $[y_{ih}] \in M_{k \times t}(K)$ where, for each $1 \le i \le k$ and all $1 \le h \le t$, we set
$y_{ih} = \sum_{j=1}^{n} v_{ij} \bullet w_{jh}$. For the most part, we will be interested in this construction for
the case $K = F$, but sometimes we will have need of the more general construction.
Note that a necessary condition for the product of two matrices to be defined is that
the number of columns in the first matrix be equal to the number of rows in the
second matrix.
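Example If $A = \begin{bmatrix} 2 & 3 & 2 \\ -1 & 2 & 1 \end{bmatrix} \in M_{2 \times 3}(\mathbb{Q})$ and $B = \begin{bmatrix} -1 & 0 & 2 & -1 \\ 1 & 0 & 1 & -2 \\ 2 & 1 & 0 & -3 \end{bmatrix} \in M_{3 \times 4}(\mathbb{Q})$
then $AB = \begin{bmatrix} 5 & 2 & 7 & -14 \\ 5 & 1 & 0 & -6 \end{bmatrix} \in M_{2 \times 4}(\mathbb{Q})$ but $BA$ is not defined.


Example If we consider the matrices

$$A = \begin{bmatrix} 2 & 3 & 2 \\ -1 & 2 & 1 \end{bmatrix} \in M_{2 \times 3}(\mathbb{Q}) \quad \text{and} \quad B = \begin{bmatrix} -1 & 0 \\ 1 & 0 \\ 2 & 1 \end{bmatrix} \in M_{3 \times 2}(\mathbb{Q})$$

then $AB = \begin{bmatrix} 5 & 2 \\ 5 & 1 \end{bmatrix} \in M_{2 \times 2}(\mathbb{Q})$ and $BA = \begin{bmatrix} -2 & -3 & -2 \\ 2 & 3 & 2 \\ 3 & 8 & 5 \end{bmatrix} \in M_{3 \times 3}(\mathbb{Q})$.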
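The definition of the matrix product translates directly into code. The sketch below is a minimal Python version for the case $K = \mathbb{Q}$, with matrices stored as lists of rows; the name `mat_mul` and the sample matrices are chosen here for illustration. It also shows that the order of the factors matters, and that a product is only defined when the inner sizes agree.

```python
def mat_mul(A, B):
    """Product of a k x n matrix A and an n x t matrix B, both given as lists of rows."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of the first matrix must match rows of the second")
    t = len(B[0])
    # Entry (i, h) is the sum over j of A[i][j] * B[j][h].
    return [[sum(A[i][j] * B[j][h] for j in range(n)) for h in range(t)]
            for i in range(len(A))]

A = [[2, 3, 2], [-1, 2, 1]]                          # 2 x 3
B = [[-1, 0], [1, 0], [2, 1]]                        # 3 x 2
B4 = [[-1, 0, 2, -1], [1, 0, 1, -2], [2, 1, 0, -3]]  # 3 x 4

AB = mat_mul(A, B)      # 2 x 2
BA = mat_mul(B, A)      # 3 x 3, a different size from AB
AB4 = mat_mul(A, B4)    # 2 x 4; mat_mul(B4, A) would raise ValueError
```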
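Suppose that $A = [a_{ij}] \in M_{k \times n}(F)$ and $v = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \in F^n$. Then $Av = \begin{bmatrix} c_1 \\ \vdots \\ c_k \end{bmatrix} \in F^k$
where, for each $1 \le i \le k$, we have $c_i = \sum_{j=1}^{n} a_{ij} b_j$. Denoting the columns of $A$
by $u_1, \ldots, u_n$, we see that $Av = \sum_{j=1}^{n} b_j u_j$. So we conclude that if there exists
a nonzero vector $v$ such that $Av = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$, then the columns of $A$ must be linearly
dependent. If every element of $F^k$ is of the form $Av$ for some $v \in F^n$, then the
columns of $A$ must form a generating set for $F^k$.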
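The two ways of reading $Av$, row by row or as a linear combination of the columns of $A$, can be compared in a short Python sketch. The matrix and vector below are chosen here for illustration, and the function names are not from the text.

```python
def mat_vec(A, v):
    """Row-by-row computation: entry i is the sum over j of A[i][j] * v[j]."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def column_combination(A, v):
    """The same vector, built as v[0]*u_1 + ... + v[n-1]*u_n from the columns u_j of A."""
    result = [0] * len(A)
    for j in range(len(v)):
        for i in range(len(A)):
            result[i] += v[j] * A[i][j]
    return result

A = [[1, 2, 3], [4, 5, 6]]
v = [1, -2, 1]          # nonzero, yet A v = 0
by_rows = mat_vec(A, v)
by_columns = column_combination(A, v)
```

Since `v` is nonzero and both results are the zero vector, the three columns of `A` satisfy the nontrivial relation $u_1 - 2u_2 + u_3 = 0$.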
Let $(K, \bullet)$ be an associative unital algebra over a field $F$ and let $n$ be a
positive integer. If $v = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}$ and $w = \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}$ are elements of $K^n$ then
$v^T w = \bigl[ \sum_{i=1}^{n} v_i \bullet w_i \bigr] \in M_{1 \times 1}(K)$. This is called the interior product of $v$ and $w$. This
$1 \times 1$ matrix is usually identified with the scalar $\sum_{i=1}^{n} v_i \bullet w_i \in K$, which we will
denote by $v \odot w$, in a departure from usual notation.¹

Dually, the exterior product of $v$ and $w$ is defined to be the matrix $vw^T = [y_{ij}] \in
M_{n \times n}(K)$, where $y_{ij} = v_i \bullet w_j$. We will denote the exterior product of $v$ and $w$
by $v \wedge w$. Notice that the exterior product is not commutative, but rather $v \wedge w =
(w \wedge v)^T$. Exterior products of vectors are encountered far less often than interior
products, but have important applications in many areas, among them physics (in
the Dirac model of quantum physics, interior products are called bra-ket products,
whereas exterior products are called ket-bra products).


¹The usual notation is $v \cdot w$, but that can cause confusion with the dot product, which we will study
later, in the case that $F = \mathbb{C}$. For that reason, also, we use the term “interior product” rather than
the often-seen “inner product”.
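In particular, we note the following: let $K$ be an algebra over a field $F$, let
$A \in M_{k \times n}(K)$, and let $B \in M_{n \times t}(K)$. Let $v_1, \ldots, v_k$ be the rows of $A$ and let
$w_1, \ldots, w_t$ be the columns of $B$. Then $AB = [c_{ij}]$, where $c_{ij} = v_i \odot w_j$ for all
$1 \le i \le k$ and all $1 \le j \le t$.

Let $F$ be a field, let $n$ be a positive integer, let $v = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}$ and $w = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$
belong to $F^n$, and let $C = [c_{ij}] \in M_{n \times n}(F)$. Then the computation of
$v \odot Cw = \sum_{i=1}^{n} a_i \bigl( \sum_{j=1}^{n} c_{ij} b_j \bigr)$ requires $n^2 + n$ multiplications and $n^2 - 1$
additions. However, if we can find vectors $u = \begin{bmatrix} u_1 \\ \vdots \\ u_n \end{bmatrix}$ and $y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}$ in $F^n$ such
that $C = u \wedge y$, then, by the distributive law, $v \odot Cw = \sum_{i=1}^{n} a_i \bigl( \sum_{j=1}^{n} u_i y_j b_j \bigr) =
\bigl( \sum_{i=1}^{n} a_i u_i \bigr) \bigl( \sum_{j=1}^{n} y_j b_j \bigr)$ and this requires only $2n + 1$ multiplications and $2n - 2$
additions. Similarly, if we can find vectors $u, u', y, y' \in F^n$ such that $C = u \wedge y +
u' \wedge y'$, then the computation of $v \odot Cw$ requires $4n + 2$ multiplications and $4n - 3$
additions (the two partial results cost $2n - 2$ additions each, and one more addition combines them).
For large values of $n$, this can result in considerable saving, especially if
the computation is to be repeated frequently.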
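The saving described here can be seen in a small Python sketch. It compares the direct evaluation of $v \odot Cw$ with the factored evaluation available when $C$ is an exterior product of two vectors. The function names and the sample vectors are chosen here for illustration.

```python
def dot(v, w):
    """Interior product of two vectors with entries in a field."""
    return sum(a * b for a, b in zip(v, w))

def outer(u, y):
    """Exterior product: the n x n matrix with entries u[i] * y[j]."""
    return [[ui * yj for yj in y] for ui in u]

def bilinear_direct(v, C, w):
    """v . (C w), using about n^2 + n multiplications."""
    return dot(v, [dot(row, w) for row in C])

def bilinear_rank_one(v, u, y, w):
    """Same value when C = outer(u, y), using only 2n + 1 multiplications."""
    return dot(v, u) * dot(y, w)

v, w = [1, 2, 3], [4, 5, 6]
u, y = [2, 0, 1], [1, 1, 1]
C = outer(u, y)
slow = bilinear_direct(v, C, w)
fast = bilinear_rank_one(v, u, y, w)
```

Here `dot(v, u)` is 5 and `dot(y, w)` is 15, so both evaluations give 75; the factored form never builds the matrix `C` at all.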


Example Combinatorial optimization is the area of mathematics dealing with the
computational issues arising from finding optimal solutions to such problems as the
traveling salesman problem, testing Hamiltonian graphs, sphere packing, etc. The
general form of combinatorial optimization problems is the following: Let $F$ be a
subfield of $\mathbb{R}$ and let $n$ be a positive integer. Assume that we have a nonempty finite
(and in general very large) subset $S$ of $\mathbb{N}^n \subseteq F^n$. Usually, the set $S$ arises from the
characteristic functions of certain subsets of $\{1, \ldots, n\}$ of interest in the problem.
Then, given a vector $v = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \in F^n$, we want to find $\min\{s \odot v \mid s \in S\}$. Note that
if we consider $F$ not as a subset of $\mathbb{R}$ but as a subset of the optimization algebra
$\mathbb{R}_\infty$, then the problem becomes one of computing $p(a_1, \ldots, a_n)$, where

$$p(X_1, \ldots, X_n) = \sum \left\{ X_1^{i_1} \cdots X_n^{i_n} \;\middle|\; \begin{bmatrix} i_1 \\ \vdots \\ i_n \end{bmatrix} \in S \right\}$$

is a polynomial in several indeterminates over $\mathbb{R}_\infty$ (polynomials with coefficients in
a semifield are defined in the same way as polynomials with coefficients in a field).
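To make the translation concrete, here is a hedged Python sketch. It assumes that the optimization algebra is the min-plus semifield, in which the "sum" of two elements is their minimum and the "product" is their ordinary sum; under that assumption a monomial $X_1^{i_1} \cdots X_n^{i_n}$ evaluated at $(a_1, \ldots, a_n)$ is $i_1 a_1 + \cdots + i_n a_n$. The set `S` and the vector `a` are invented for illustration.

```python
INFINITY = float("inf")   # plays the role of the additive identity under min

def evaluate_minplus_polynomial(S, a):
    """Evaluate the polynomial with one monomial per exponent vector s in S, in min-plus arithmetic."""
    value = INFINITY
    for s in S:
        monomial = 0                      # multiplicative identity under ordinary +
        for exponent, x in zip(s, a):
            monomial += exponent * x      # x "multiplied" by itself exponent times
        value = min(value, monomial)      # min-plus "addition" of monomials
    return value

def brute_force_minimum(S, a):
    """The original problem: the least interior product of a with an element of S."""
    return min(sum(si * ai for si, ai in zip(s, a)) for s in S)

S = [(1, 0, 1), (0, 1, 1), (1, 1, 0)]   # characteristic vectors of three 2-element subsets
a = (3, 5, 2)
p_value = evaluate_minplus_polynomial(S, a)
```

The three monomials evaluate to 5, 7, and 8, so the polynomial takes the value 5, which is also the minimum found by `brute_force_minimum`.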


Observe that multiplying a $k \times n$ matrix by an $n \times t$ matrix requires $ktn$
multiplications and $kt(n - 1)$ additions, for a total of $kt(2n - 1)$
arithmetic operations. If these numbers are all very large, as is often the case in
real-life applications of matrix theory, the computational overhead, and the risk of
accumulated errors due to rounding and truncation, is substantial.² We will keep this
in mind throughout our discussion, and try to consider strategies of minimizing this
risk. In this connection, we should note that the product of two matrices has an
important property: let $(K, \bullet)$ be an associative unital algebra over a field $F$ and assume
that $A = [v_{ij}] \in M_{k \times n}(K)$ and $B = [w_{ij}] \in M_{n \times t}(K)$, where $k$, $n$, and $t$ are
positive integers. Furthermore, let us pick positive integers

$$1 = k(1) < k(2) < \cdots < k(p+1) = k,$$
$$1 = n(1) < n(2) < \cdots < n(q+1) = n,$$
$$1 = t(1) < t(2) < \cdots < t(r+1) = t.$$

For all $1 \le i \le p$ and all $1 \le j \le q$, let $A_{ij} = \begin{bmatrix} v_{k(i),n(j)} & \cdots & v_{k(i),n(j+1)} \\ \vdots & & \vdots \\ v_{k(i+1),n(j)} & \cdots & v_{k(i+1),n(j+1)} \end{bmatrix}$.
This allows us to write $A$ in block form $\begin{bmatrix} A_{11} & \cdots & A_{1q} \\ \vdots & & \vdots \\ A_{p1} & \cdots & A_{pq} \end{bmatrix}$. Note that these blocks
are not necessarily square matrices. In the same way, we can write $B$ as a matrix
$\begin{bmatrix} B_{11} & \cdots & B_{1r} \\ \vdots & & \vdots \\ B_{q1} & \cdots & B_{qr} \end{bmatrix}$. Then $AB = \begin{bmatrix} C_{11} & \cdots & C_{1r} \\ \vdots & & \vdots \\ C_{p1} & \cdots & C_{pr} \end{bmatrix}$ where, for each $1 \le i \le p$ and
each $1 \le h \le r$, we have $C_{ih} = \sum_{j=1}^{q} A_{ij} B_{jh}$. A sophisticated use of this method
can substantially decrease the number of operations needed to multiply two
matrices, as we shall see. Moreover, skilled partitioning of matrices can allow us to make
efficient use of aspects of modern computer architecture, such as cache memories,
to further increase the speed of computation.
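Block multiplication can be checked on a small case. The following Python sketch cuts two matrices into blocks, multiplies them block by block using $C_{ih} = \sum_j A_{ij} B_{jh}$, and reassembles the result. The helper names are invented here, and the cut points are given as half-open index ranges (0-based), which is a convenience of this sketch rather than the indexing used in the text. The column cuts of the first matrix must equal the row cuts of the second.

```python
def mat_mul(A, B):
    """Ordinary product of two matrices given as lists of rows."""
    return [[sum(A[i][j] * B[j][h] for j in range(len(B))) for h in range(len(B[0]))]
            for i in range(len(A))]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(M, row_cuts, col_cuts):
    """Cut M into blocks; row_cuts [0, 2, 3] means rows 0-1 and row 2."""
    return [[[row[c0:c1] for row in M[r0:r1]]
             for c0, c1 in zip(col_cuts, col_cuts[1:])]
            for r0, r1 in zip(row_cuts, row_cuts[1:])]

def block_mul(Ab, Bb):
    """Block (i, h) of the product is the sum over j of Ab[i][j] * Bb[j][h]."""
    result = []
    for i in range(len(Ab)):
        block_row = []
        for h in range(len(Bb[0])):
            block = mat_mul(Ab[i][0], Bb[0][h])
            for j in range(1, len(Bb)):
                block = mat_add(block, mat_mul(Ab[i][j], Bb[j][h]))
            block_row.append(block)
        result.append(block_row)
    return result

def assemble(blocks):
    """Glue a grid of blocks back into one matrix."""
    rows = []
    for block_row in blocks:
        for parts in zip(*block_row):
            rows.append([x for part in parts for x in part])
    return rows

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # 3 x 4
B = [[1, 0], [0, 1], [2, 1], [1, 3]]                # 4 x 2
Ab = split(A, [0, 2, 3], [0, 1, 4])
Bb = split(B, [0, 1, 4], [0, 1, 2])
blockwise = assemble(block_mul(Ab, Bb))
```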

Needless to say, this seemingly odd definition of the product of two matrices
was not chosen at random. Indeed, it satisfies certain important properties. Thus,
if $(K, \bullet)$ is an associative unital algebra over a field $F$, if $k$, $n$, $t$, and $p$ are
positive integers, and if we have matrices $A \in M_{k \times n}(K)$, $B, B_1, B_2 \in M_{n \times t}(K)$, and
$C \in M_{t \times p}(K)$, then
(1) $A(BC) = (AB)C$;

(2) $A(B_1 + B_2) = AB_1 + AB_2$;

(3) $(B_1 + B_2)C = B_1 C + B_2 C$.

As a consequence, we see that if $B \in M_{n \times t}(K)$ is given, then the function from
$M_{k \times n}(K)$ to $M_{k \times t}(K)$ defined by $A \mapsto AB$ is a linear transformation of vector
spaces.

²We will often mention large matrices, without being too specific as to what that means. As a rule
of thumb, a matrix is “large”, and calls for special treatment as such, when it cannot be stored in
the RAM of whatever computer we are using for our computations. Such matrices occur
in sufficiently many applications that considerable research is devoted to dealing with them.
We also note that if $A = [v_{ij}] \in M_{k \times n}(K)$ and $B = [w_{jh}] \in M_{n \times t}(K)$, then
$A^T \in M_{n \times k}(K)$ and $B^T \in M_{t \times n}(K)$, so $B^T A^T \in M_{t \times k}(K)$. Indeed, $B^T A^T =
[y_{hi}]$, where $y_{hi} = \sum_{j=1}^{n} w_{jh} \bullet v_{ij}$. Hence, if $K$ is also commutative (and in
particular if $K = F$), we have $B^T A^T = (AB)^T$.


Matrix multiplication was first defined by
the nineteenth-century French mathematician
Jacques Philippe Binet. It took some getting
used to; many decades later, the father of
astrophysics, Sir Arthur Eddington, still wrote
“I cannot believe that anything so ugly as
multiplication of matrices is an essential part of the
scheme of nature”.


The definition of matrix multiplication is in fact a direct consequence of the rela- 
tion between matrices and linear transformations, which we have already observed. 
This is best seen in the following result. 


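Proposition 8.2 Let $V$ be a vector space of finite dimension $n$ over a field
$F$ for which we have chosen a basis $B = \{v_1, \ldots, v_n\}$, let $W$ be a vector
space of finite dimension $k$ over $F$ for which we have chosen a basis
$D = \{w_1, \ldots, w_k\}$, and let $Y$ be a vector space of finite dimension $t$ over $F$,
for which we have chosen a basis $E = \{y_1, \ldots, y_t\}$. If $\alpha \in \operatorname{Hom}(V, W)$ and
$\beta \in \operatorname{Hom}(W, Y)$ then $\Phi_{BE}(\beta\alpha) = \Phi_{DE}(\beta)\Phi_{BD}(\alpha)$.


Proof Assume that $\Phi_{BD}(\alpha) = [a_{ij}]$ and $\Phi_{DE}(\beta) = [b_{hi}]$. Then, for every vector
$v = \sum_{j=1}^{n} c_j v_j$ in $V$, we have

$$\alpha : v \mapsto \sum_{i=1}^{k} \sum_{j=1}^{n} c_j a_{ij} w_i \quad \text{and} \quad \beta\alpha : v \mapsto \sum_{h=1}^{t} \sum_{i=1}^{k} \sum_{j=1}^{n} c_j b_{hi} a_{ij} y_h,$$

showing the desired equality.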
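The proposition can be tested numerically in the simplest setting, where all three bases are canonical. The Python sketch below builds the matrix of a composite map column by column and compares it with the product of the two representing matrices. The names and the sample matrices are chosen here for illustration; a general choice of bases would also need a coordinate change, which this sketch does not attempt.

```python
def mat_mul(A, B):
    """Ordinary product of two matrices given as lists of rows."""
    return [[sum(A[i][j] * B[j][h] for j in range(len(B))) for h in range(len(B[0]))]
            for i in range(len(A))]

def mat_vec(A, v):
    """Product of a matrix with a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def matrix_from_map(f, n):
    """Matrix of f with respect to canonical bases: column j is f(e_j)."""
    columns = [f([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    return [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]

A = [[1, 2, 0], [0, 1, 1]]        # represents alpha : Q^3 -> Q^2
Bm = [[1, 0], [1, 1], [0, 2]]     # represents beta  : Q^2 -> Q^3

alpha = lambda v: mat_vec(A, v)
beta = lambda v: mat_vec(Bm, v)

composite = matrix_from_map(lambda v: beta(alpha(v)), 3)
product = mat_mul(Bm, A)
```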


We can extend the definition of matrix multiplication as follows: let $h$, $k$, and
$n$ be positive integers, and let $V$ be a vector space over a field $F$. If $A = [a_{ij}] \in
M_{h \times k}(F)$ and if $M = [v_{jt}] \in M_{k \times n}(V)$, we can define $AM \in M_{h \times n}(V)$ to be the
matrix $[u_{it}]$, where $u_{it} = \sum_{j=1}^{k} a_{ij} v_{jt}$ for all $1 \le i \le h$ and $1 \le t \le n$. Notice that
if $A, B \in M_{k \times k}(F)$ and if $M, N \in M_{k \times n}(V)$ then
(1) $A(BM) = (AB)M$;

(2) $A(M + N) = AM + AN$;
(3) $(A + B)M = AM + BM$.

In general, and especially when we are talking of actual computations, it is easier
to work with matrices than with linear transformations, and indeed most modern
computer software and hardware are designed to facilitate easy and speedy
matrix computation. Therefore, given finitely-generated vector spaces $V$ and $W$ over a
field $F$, it is usual to fix bases for them and then identify $\operatorname{Hom}(V, W)$ with the space
of all matrices over $F$ of the appropriate size. The choice of the correct bases then
becomes critical, and we will focus on that throughout the following discussions.
Such a choice usually depends on the problem at hand. In particular, the automatic
choice of canonical bases, when they exist, may not be the best for a given
problem, and can entail a considerable cost both in computational time and numerical
accuracy.


Exercises 


Exercise 404 

Let $V$ be the vector space over $\mathbb{R}$ composed of all polynomials in $\mathbb{R}[X]$
having degree less than 3 and let $W$ be the vector space over $\mathbb{R}$ composed of all
polynomials in $\mathbb{R}[X]$ having degree less than 4. Let $\alpha : V \to W$ be the linear
transformation defined by

$$\alpha : a + bX + cX^2 \mapsto (a + b) + (b + c)X + (a + c)X^2 + (a + b + c)X^3.$$

Select bases $B = \{1, X + 1, X^2 + X + 1\}$ for $V$ and
$D = \{X^3 - X^2, X^2 - X, X - 1, 1\}$
for $W$. Find the matrix $\Phi_{BD}(\alpha)$.


Exercise 405 


Let K = Mo2,.2(R) and let A = : i € K. Let D be the canonical basis of K. 


If a, B € End(K) are defined by a: X + XA and B: Xb AX, find Ppp(a) 
and ®pp(). 


Exercise 406 


0 1 2 3 
Given the matrix A=] 1 3 4 0] € M3,4(R), find the set of all matrices 
3 2 0 1 
0 
1 
0 


Exercise 407 


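Given the matrix $A = \begin{bmatrix} 1 & 8 \\ 3 & 5 \\ 2 & 2 \end{bmatrix} \in M_{3 \times 2}(\mathbb{Q})$, find the set of all matrices
$B \in M_{2 \times 3}(\mathbb{Q})$ satisfying $AB = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and find the set of all matrices
$C \in M_{2 \times 3}(\mathbb{Q})$ satisfying $CA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$.


Exercise 408

Given the matrix $A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 2 \\ 4 & 3 & 1 \\ 2 & 1 & -1 \end{bmatrix} \in M_{4 \times 3}(\mathbb{R})$, find the set of all matrices
$B \in M_{3 \times 4}(\mathbb{R})$ satisfying $BA = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.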


Exercise 409 


a+b+c 
Find the matrix representing the linear transformation a : a . a4 


a 
b 
c 

from R? to R? with respect to the bases [is of R? and 


tL} Lol} on™ 


Exercise 410 
Find the set of all matrices A € M4,.3(R) satisfying the condition 


0 
0 
0 
0 


ooor 
ooroeo 
oe aa = =) 


Exercise 411 
Let $\alpha$ be an endomorphism of $\mathbb{R}^3$ represented with respect to some basis by the
matrix $\begin{bmatrix} 0 & 2 & -1 \\ -2 & 5 & -2 \\ -4 & 8 & -3 \end{bmatrix}$. Is $\alpha$ a projection?


Exercise 412 
Find the real numbers missing from the following equation: 


| es | 
x | 
— 
mm * 
*¥ 
Oo x 
es | 
| 

x | 
— 
* * 
x Ne 
Cox *¥ WwW 
Exercise 413


a 3a+2b 
Leta:| b | +> | —a—c | be an endomorphism of R?. Find the matrix repre- 
é a+ 3b 
1 0 
senting a with respect to the basis B = -1], O;,] 1 of R3. 
0 -l 0 


Exercise 414 
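Let $D = \{v_1, v_2, v_3\}$ be a basis for $\mathbb{R}^3$ and let $\alpha$ be the endomorphism of $\mathbb{R}^3$
satisfying $\Phi_{DD}(\alpha) = \begin{bmatrix} -1 & -1 & -3 \\ -5 & -2 & -6 \\ 2 & 1 & 3 \end{bmatrix}$. Find $\ker(\alpha)$.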


Exercise 415 
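Let $V$ be the subspace of $\mathbb{R}[X]$ consisting of all polynomials of degree less
than 3 and choose the basis $B = \{1, X, X^2\}$ for $V$. Let $\alpha \in \operatorname{End}(V)$ satisfy
$\Phi_{BB}(\alpha) = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 2 \\ 0 & 0 & 3 \end{bmatrix}$. Let $D$ be the basis $\{1, X + 1, 2X^2 + 4X + 3\}$ for $V$.
What is $\Phi_{DD}(\alpha)$?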


Exercise 416 

Let w € End(R?) be represented with respect to the canonical basis by the matrix 
2 2 0 
1 1 2 |. Find areal number a such that @ is represented with respect to the 
1 1 2 


0 0 0 
basis -l],] a j,jl by the matrix] 0 1 0O 
0 0 4 


Exercise 417 
Let V be the subspace of R[X] consisting of all polynomials of degree less than 3 
and let a € End(V) be defined by 
a:aX* +bX +c (at 2b+c)X* + Ga—b)X + (b+ 2c). 
Find @pp(a), where D = {X?2+ X +1, X2+ X, X?}. 


Exercise 418 
Find all rational numbers a for which there exists a nonzero matrix Be 


0 0 0 
a 1 1 000 
Max3(Q) satisfying B] 1 1 ajl= 
lal 0 0 0 
0 0 0 
Exercise 419
For which real numbers a does there exist a real number b (depending on a) 


satisfying ki : | : i = E i? 
1 lia ~10 14° 
1 b 


Exercise 420 

Let $V = \mathbb{R}^{\mathbb{R}}$ and let $W$ be the subspace of $V$ generated by the
linearly-independent set $B = \{1, x, e^x, xe^x\}$. Let $\delta$ be the endomorphism of $W$ which
assigns to each function its derivative. Find $\Phi_{BB}(\delta)$.


Exercise 421 
Let $B = \{1 + i, 2 + i\}$, which is a basis for $\mathbb{C}$ as a vector space over $\mathbb{R}$. Let $\alpha$ be
the endomorphism of this space defined by $\alpha : z \mapsto \bar{z}$. Find $\Phi_{BB}(\alpha)$.


Exercise 422 
Let F = GF(3) and let a : F? > F®* be the linear transformation defined by 
a 
a:| bie mo Let B : F* — F* be the linear transformation defined by 
c 
b 


B: | a i |: Find the matrix representing Ba with respect to the canonical 
2a 
bases. 


Exercise 423 
Let w € End(R*) be represented with respect to the canonical basis by the matrix 
3 -1 0 O 
-1 2 -l 0 
0 -1 2 -1 
0 oO -!l 1 
entries of a(v) are nonnegative, show that all entries of v are nonnegative. 


. Given a vector v € R* satisfying the condition that all 


Exercise 424 

Let V and W be vector spaces over a field F and choose bases {vj | i € 2} 
and {w; | j € A} for V and W, respectively. Let p: 82 x A — F be a func- 
tion satisfying the condition that the set {j € A | p(i, j) 4 0} is finite for each 
i€ 2. Leta,:V— W be the function defined as follows: if v = ier AiVis 
where J" is a finite subset of §2 and where the a; are scalars in F, then 
ap(V) = Vier Dijek aj p(i, j)w;. Show that ap is a linear transformation and 
that every linear transformation from V to W is of this form. 


Exercise 425 
Let $k$ and $n$ be positive integers, let $v \in \mathbb{R}^n$, and let $A \in M_{k \times n}(\mathbb{R})$. Show that
$Av = O$ if and only if $A^T Av = O$.


Exercise 426
Let $A \in M_{3 \times 2}(\mathbb{R})$ and $B \in M_{2 \times 3}(\mathbb{R})$ be matrices satisfying

$$AB = \begin{bmatrix} 8 & 2 & -2 \\ 2 & 5 & 4 \\ -2 & 4 & 5 \end{bmatrix}.$$

Calculate $BA$.


Exercise 427 
Find matrices $A \in M_{3 \times 2}(\mathbb{R})$ and $B \in M_{2 \times 3}(\mathbb{R})$ satisfying

$$AB = \begin{bmatrix} 1 & 1 & 1 \\ -2 & 0 & -6 \\ 0 & 1 & -2 \end{bmatrix}.$$

Exercise 428 

Let F be a field and let k 4 be positive integers. Let A, B € Mxxn(F) 
and let a: Myxk(F) > Mikxn(F) be the linear transformation defined by 
a:Ct>» ACB. Under which conditions is @ an isomorphism? 


Exercise 429 

Let F be a field and let n be a positive integer. Let W be a nontrivial subspace 
of the vector space V = M,,y,(F) satisfying the condition that if A € W and 
B € V then AB and BA both belong to W. Show that W= V. 


Exercise 430 

Let $a, b, c, a', b', c' \in \mathbb{C}$ satisfy the condition that $aa' + bb' + cc' = 2$, and let
$A = I - \begin{bmatrix} a \\ b \\ c \end{bmatrix} \begin{bmatrix} a' & b' & c' \end{bmatrix}$. Calculate $A^2$.


Exercise 431 
Find a nonzero matrix A in M>,.3(R) satisfying v © Av =0 for all v € R?. 


Exercise 432 
Let w be the endomorphism of Q* represented with respect to the canonical basis 


101 -!1 
by the matrix 0 1 : i . Find a two-dimensional subspace of Q* which 
3 13 #4 


is invariant under a. 
Exercise 433

Let $n$ be a positive integer and let $F$ be a field. For each $v, w \in F^n$, consider
the function $\tau_{v,w} : F^n \to Fv$ defined by $\tau_{v,w} : y \mapsto (w \odot y)v$ (this function is
called the dyadic product function). Show that $\tau_{v,w}$ is a linear transformation. Is
the function $F^n \to \operatorname{Hom}(F^n, Fv)$ defined by $w \mapsto \tau_{v,w}$ a linear transformation?


Exercise 434 

Let k <n be positive integers and let F be a field. Given a matrix A € Mxxn(F), 
do there necessarily exist matrices B, C € M,%(F) satisfying the condition that 
AB=OEM gy (F) and CA =O€ Mayyn(F)? 


Exercise 435 

Let A € Mixn(Q) be a matrix satisfying the condition that if v € Q” is a vector 
all of the components of which are nonnegative, then all of the components of 
Av are nonnegative. Are all of the entries in A necessarily nonnegative? 


Exercise 436 


Find the set of all matrices $A$ in $M_{3\times 3}(\mathbb{R})$ satisfying

\[
A^2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}.
\]


Exercise 437 
Find the set of all real numbers $a$ such that the endomorphism of $\mathbb{R}^3$ represented
by the matrix

\[
\begin{bmatrix} 1 & a & a \\ 2 & 2a & 4 \\ 3 & a & 6 \end{bmatrix}
\]

with respect to the canonical basis is an automorphism.


Exercise 438 
Find the set of all real numbers $a$ and $b$ such that the endomorphism of $\mathbb{R}^3$
represented by the matrix

\[
\begin{bmatrix} 1 & a & b \\ 0 & a & 1 \\ 0 & a & 1 \end{bmatrix}
\]

with respect to the canonical basis is a projection.


Exercise 439 
Let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ satisfy the condition that for each $v \in \mathbb{R}^n$ there exists
a vector $y \in \mathbb{R}^n$, all entries of which are nonnegative, satisfying $Av = v + y$. Show
that $A = I$.


Exercise 440 
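Let $F = GF(2)$ and let $K$ be the subset of $M_{3\times 3}(F)$ consisting of $O$, $I$, and the
following matrices:

\[
\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{bmatrix},\;
\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 0 \end{bmatrix},\;
\begin{bmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix},\;
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix},\;
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix},\text{ and }
\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix}.
\]

Show that $K$, together with matrix addition and multiplication, is a field. What
is its characteristic? Does there exist an element $A$ of $K$ such that every nonzero
element of $K$ is a power of $A$?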


The Algebra of Square Matrices 


We are now going to concentrate on the algebraic structure of sets of the form
$M_{n\times n}(K)$, where $n$ is a positive integer and $(K, \bullet)$ is an associative unital
algebra over a field $F$. From what we have already seen, this is again an associative
unital $F$-algebra, which will not be commutative if $n > 1$. The additive identity of
this algebra is the matrix all of the entries of which equal $0_K$. The additive inverse
of a matrix $A = [a_{ij}] \in M_{n\times n}(K)$ is the matrix $[-a_{ij}]$. The multiplicative identity
of $M_{n\times n}(K)$ is the matrix $E = [d_{ij}]$ given by

\[
d_{ij} = \begin{cases} e & \text{if } i = j, \\ 0_K & \text{otherwise,} \end{cases}
\]

where $e$ is the multiplicative identity of $(K, \bullet)$.
The most important case is, of course, that of $K = F$. In this case, the additive
identity is $O$ and the multiplicative identity is the matrix $I = [a_{ij}]$ defined by

\[
a_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}
\]

If $K$ is a vector space of dimension $n$ over $F$ and if $B$ is a basis of $K$, then it is
straightforward to verify that the function $\Phi_{BB}: \mathrm{End}(K) \to M_{n\times n}(F)$ is an
isomorphism of unital $F$-algebras.

If $F$ is a field and if $n$ is a positive integer then, corresponding to the associative
$F$-algebra $M_{n\times n}(F)$, we have the Lie algebra $M_{n\times n}(F)^{-}$. This Lie algebra is
called the general Lie algebra defined by $F^n$.


Example Let $F$ be a field and let $A = [a_{ij}] \in M_{4\times 4}(F)$. Then $A$ can also
be written in block form as

\[
\left[\begin{array}{cc|cc}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\ \hline
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{array}\right] \in M_{2\times 2}(K),
\]

where $K = M_{2\times 2}(F)$. Addition and multiplication of matrices are so defined (and not
accidentally!) that they give the same results whether performed in $M_{4\times 4}(F)$ or in
$M_{2\times 2}(K)$.


Example The set $K$ of all analytic functions from $\mathbb{C}$ to itself is clearly an algebra
over $\mathbb{C}$. At the beginning of the twentieth century, G. D. Birkhoff made use of
matrices in $M_{n\times n}(K)$ to study the properties of analytic functions.



George D. Birkhoff was one of the leading American mathematicians 
at the beginning of the twentieth century, who worked in many areas 
of analysis. 


We begin by identifying some particularly important square matrices over a
unital associative $F$-algebra $K$, and with them some significant subalgebras of
$M_{n\times n}(K)$.

Let $(K, \bullet)$ be an associative unital $F$-algebra and let $n$ be a positive integer. A matrix
$A = [a_{ij}] \in M_{n\times n}(K)$ is a diagonal matrix if and only if there exist elements
$c_1, \ldots, c_n$ of $K$ such that

\[
a_{ij} = \begin{cases} c_i & \text{if } i = j, \\ 0_K & \text{otherwise.} \end{cases}
\]

The matrices $O$ and $E$ are diagonal. Moreover, the sum and product of diagonal
matrices are diagonal matrices, and so the set of all diagonal matrices is an
$F$-subalgebra of $M_{n\times n}(K)$. If $K$ is commutative (and, in particular, if $K = F$)
then this algebra is also commutative. The units of the subalgebra are all diagonal
matrices in which each $c_i$ is a unit of $K$ (and hence surely nonzero). In this
case,

\[
\begin{bmatrix} c_1 & \cdots & 0_K \\ \vdots & \ddots & \vdots \\ 0_K & \cdots & c_n \end{bmatrix}^{-1}
= \begin{bmatrix} c_1^{-1} & \cdots & 0_K \\ \vdots & \ddots & \vdots \\ 0_K & \cdots & c_n^{-1} \end{bmatrix}.
\]


Example Let $F$ be a field, let $(K, \bullet)$ be an associative unital $F$-algebra, and let $n$ be
a positive integer. A matrix $A = [a_{ij}] \in M_{n\times n}(K)$ is a scalar matrix if and only if
there exists a scalar $c \in K$ such that $a_{ij} = c$ when $i = j$ and $a_{ij} = 0_K$ otherwise. We
denote this matrix by $cE$ (and, in particular, $cI$ when $K = F$). Scalar matrices are
surely diagonal matrices, and both $O$ and $E$ are scalar matrices. Moreover, the sum
and product of scalar matrices are scalar matrices. If $c, d \in K$ then $(cE)(dE) =
(dE)(cE)$ and if $0_K \neq c \in K$ is a unit, then $(cE)(c^{-1}E) = E$. Hence the set of
all scalar matrices over $F$ forms an $F$-subalgebra of $M_{n\times n}(F)$, which is in fact a
field. The function $F \to M_{n\times n}(F)$ defined by $c \mapsto cI$ is a monic homomorphism
of $F$-algebras, and so we can identify $F$ with the subfield of all scalar matrices
of $M_{n\times n}(F)$. Moreover, it is also easy to see that $(cI)A = A(cI) = cA$ for any
$A \in M_{n\times n}(F)$.


Let $(K, \bullet)$ be an associative unital $F$-algebra, let $n$ be a positive integer, and let $d$
be a positive integer less than $n$. A matrix $A = [v_{ij}] \in M_{n\times n}(K)$ is a band matrix
of width $2d - 1$ if and only if $v_{ij} = 0_K$ whenever $|i - j| > d - 1$. Thus, the band
matrices of width 1 are the diagonal matrices. The matrix


€ Msx5(R) 


CooOoNre 
ocooown 
ooocoro 
ROCCO 
roc oO 


is an example of a band matrix of width 3. The set of band matrices of fixed width
is closed under addition and contains $O$ and $I$, but is not necessarily closed under
multiplication, and so is not a subalgebra of $M_{n\times n}(K)$. However, it is closed under
scalar multiplication and so is a subspace of the vector space $M_{n\times n}(K)$ over $F$.

Band matrices over a field are very important for numerical computations,
especially when $d$ is small relative to $n$. Of particular importance are band matrices of
width 3, which are also known as tridiagonal matrices. They are used in
the computation of quadratic splines and in the computation of extremal eigenvalues
of matrices; they also appear very often in methods of solution of differential
equations. Tridiagonal matrices have the added advantage of being easily stored in
a computer, since all we need to do is keep the three diagonals in which nonzero
entries can occur. For example, a tridiagonal matrix in $M_{1000\times 1000}(\mathbb{R})$ has 1,000,000
entries, of which at most 2998 are nonzero.
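
The storage remark can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `tridiag_matvec` and the choice of entries (2 on the diagonal, -1 beside it) are hypothetical and not taken from the text.

```python
def tridiag_matvec(lower, diag, upper, x):
    """Multiply a tridiagonal matrix, stored as its three diagonals, by the vector x.

    lower[i] is the entry in position (i+1, i), diag[i] the entry in (i, i),
    and upper[i] the entry in (i, i+1), with 0-based indices.
    """
    n = len(diag)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += upper[i] * x[i + 1]
        y[i + 1] += lower[i] * x[i]
    return y


n = 1000
diag = [2.0] * n
lower = [-1.0] * (n - 1)
upper = [-1.0] * (n - 1)

# Only the three diagonals are kept: 999 + 1000 + 999 numbers instead of 1,000,000.
stored = len(lower) + len(diag) + len(upper)
y = tridiag_matvec(lower, diag, upper, [1.0] * n)
```

Here `stored` counts the numbers actually kept, and the matrix-vector product touches only those entries.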

A special type of tridiagonal matrix in $M_{2n\times 2n}(F)$, which we will see again later,
is one of the form

\[
\begin{bmatrix} A_{11} & O & \cdots & O \\ O & A_{22} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_{nn} \end{bmatrix},
\]

where the $A_{ii}$ are $2 \times 2$ blocks. Note that this matrix can also be thought of as a
diagonal matrix in $M_{n\times n}(K)$, where $K = M_{2\times 2}(F)$. More generally, if $d$ and $n$ are
positive integers, then any diagonal matrix in $M_{n\times n}(L)$, where $L = M_{d\times d}(K)$, is a
band matrix of width $2d - 1$ in $M_{dn\times dn}(K)$.

Let $(K, \bullet)$ be an associative unital $F$-algebra and let $n$ be a positive integer. A matrix
$A = [c_{ij}] \in M_{n\times n}(K)$ is an upper-triangular matrix if and only if $c_{ij} = 0_K$
whenever $i > j$. Thus, the matrix

\[
\begin{bmatrix} 1 & 2 & 6 & 3 & 7 \\ 0 & 3 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 4 \end{bmatrix} \in M_{5\times 5}(\mathbb{R})
\]

is upper triangular. The set of all upper-triangular matrices includes the set of diagonal
matrices, is closed under addition, and contains $O$ and $E$. Moreover, it is closed under
multiplication, and so is an $F$-subalgebra of $M_{n\times n}(K)$. In the case that $K = F$, we
see that the dimension of this subalgebra as a vector space over $F$ equals $\frac{1}{2}n(n+1)$.
Upper-triangular matrices arise naturally in many applications, as we will see below.
In a similar manner, we say that a matrix $A = [c_{ij}] \in M_{n\times n}(K)$ is a
lower-triangular matrix if and only if $c_{ij} = 0_K$ whenever $i < j$. Again, the set of all
lower-triangular matrices is a subspace of the vector space $M_{n\times n}(K)$ over $F$ and,
indeed, an $F$-subalgebra. Note that a matrix $A$ is upper triangular if and only if $A^T$
is lower triangular.

A matrix $A = [c_{ij}] \in M_{n\times n}(K)$ is symmetric if and only if $A = A^T$. That is,
$A$ is symmetric if and only if $c_{ij} = c_{ji}$ for all $1 \le i, j \le n$. If $B$ is any matrix in
$M_{n\times n}(K)$ then $B + B^T$ is symmetric. If $K$ is commutative and if $C \in M_{k\times n}(K)$
for any positive integers $k$ and $n$, then $CC^T \in M_{k\times k}(K)$ and $C^TC \in M_{n\times n}(K)$
are symmetric. If $n$ is a positive integer and $F$ is a field, then $v \wedge v$ is a symmetric
matrix in $M_{n\times n}(F)$ for all $v \in F^n$. Diagonal matrices are clearly symmetric and
the set of symmetric matrices in $M_{n\times n}(K)$ is closed under taking sums and scalar
multiples, and so it is a subspace of the vector space $M_{n\times n}(K)$ over $F$. In the case
$K = F$, the dimension of this subspace equals $\frac{1}{2}n(n+1)$. However, the set of symmetric
matrices is not closed under products. For example, the matrices

\[
A = \begin{bmatrix} 2 & 5 & 1 \\ 5 & 2 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 0 & 0 \\ 1 & 0 & 3 \end{bmatrix}
\]

in $M_{3\times 3}(\mathbb{R})$ are symmetric, but

\[
AB = \begin{bmatrix} 13 & 4 & 5 \\ 9 & 10 & 5 \\ 2 & 2 & 4 \end{bmatrix}
\]

is not. In fact, in Chap. 13 we will show that if $n > 1$ then every matrix in $M_{n\times n}(\mathbb{C})$
is a product of two symmetric matrices.

We note, however, that if $A$ and $B$ are a commuting pair of symmetric matrices
then $(AB)^T = (BA)^T = A^TB^T = AB$, so $AB$ is again symmetric.

A matrix $A = [c_{ij}] \in M_{n\times n}(K)$ is skew symmetric if and only if $A = -A^T$.
The set of all skew-symmetric matrices in $M_{n\times n}(K)$ is again a subspace of
$M_{n\times n}(K)$. Note that if $F$ is a field having characteristic other than 2, then any
matrix $A \in M_{n\times n}(K)$ can be written as the sum of a symmetric matrix and a
skew-symmetric matrix, since $A = \frac{1}{2}(A + A^T) + \frac{1}{2}(A - A^T)$. In one of the examples after
Proposition 5.14, we saw that this representation is in fact unique. The Lie product
of two skew-symmetric matrices is again skew symmetric.


Example Let $n$ be a positive integer. A matrix $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ is a Markov
matrix if and only if $a_{ij} \ge 0$ for all $1 \le i, j \le n$ and $\sum_{j=1}^{n} a_{hj} = 1$ for each
$1 \le h \le n$; it is a stochastic matrix if and only if both $A$ and $A^T$ are Markov matrices. It
is easy to show that the product of two Markov matrices is again a Markov matrix
and the product of two stochastic matrices in $M_{n\times n}(\mathbb{R})$ is again a stochastic matrix.

Markov matrices arise naturally in probability theory. In particular, if we have
a system which, at each tick of a (discrete) clock, is in one of the distinct states
$s_1, \ldots, s_n$ and if, for each $1 \le i, j \le n$, we denote by $p_{ij}$ the probability that if the
system is in state $i$ at a given time $t$ then it will be in state $j$ at time $t + 1$, the
matrix $[p_{ij}]$ is a Markov matrix.
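
The closure of Markov matrices under products can be checked on a small case. The following Python sketch uses exact rational arithmetic; the two matrices `P` and `Q` and the helper names are arbitrary choices made for illustration and do not come from the text.

```python
from fractions import Fraction as Fr


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def is_markov(A):
    """All entries nonnegative and every row sums to 1."""
    return all(a >= 0 for row in A for a in row) and all(sum(row) == 1 for row in A)


P = [[Fr(1, 2), Fr(1, 2), Fr(0)],
     [Fr(1, 4), Fr(1, 4), Fr(1, 2)],
     [Fr(0), Fr(1, 3), Fr(2, 3)]]
Q = [[Fr(1, 3), Fr(1, 3), Fr(1, 3)],
     [Fr(1), Fr(0), Fr(0)],
     [Fr(1, 5), Fr(2, 5), Fr(2, 5)]]

PQ = matmul(P, Q)
```

Row $h$ of `PQ` sums to $\sum_k p_{hk} \sum_j q_{kj} = \sum_k p_{hk} = 1$, which is the general argument in miniature.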


Russian mathematician Andrei Andreyevich Markov made major 
contributions to probability theory at the beginning of the twentieth 
century. 


As we have already pointed out, a matrix $O \neq A \in M_{n\times n}(K)$ is not necessarily a
unit. The units of $M_{n\times n}(K)$ are known as nonsingular matrices; the other matrices
are singular matrices. By what we have already noted, the product of nonsingular
matrices is again nonsingular and if $A$ is nonsingular then surely so is $A^{-1}$. A matrix
$A$ satisfying $A^2 = I$ is certainly nonsingular. Such matrices are called involutory
matrices.




These terms were first used by American mathematician Maxime
Bôcher in 1907. He was also the first to popularize the terms “linearly
dependent” and “linearly independent”.


Example If $a, b \in \mathbb{R}$ with $b \neq 0$ then

\[
\begin{bmatrix} a & b \\ b^{-1}(1 - a^2) & -a \end{bmatrix}
\]

is involutory.


Example We have already noted that if $n$ is a positive integer then a diagonal
matrix $A = [a_{ij}] \in M_{n\times n}(\mathbb{C})$ is nonsingular when all of the diagonal entries
$a_{ii}$ are nonzero. It therefore seems reasonable to conjecture that a matrix will
be nonsingular if the diagonal entries are all “much greater” than the other entries.
Indeed, this is true in the following sense: a sufficient condition for a matrix
$A = [a_{ij}] \in M_{n\times n}(\mathbb{C})$ to be nonsingular is that for each $1 \le i \le n$ we have
$|a_{ii}| > \sum_{j \neq i} |a_{ij}|$. This result is known as the Diagonal Dominance Theorem.
A proof of this theorem will be given in Chap. 15.

Let $V$ be a vector space over $F$ of dimension $n$ and let $B$ be a basis of $V$. Then, for
any matrix $A \in M_{n\times n}(F)$, there exists an endomorphism $\alpha$ of $V$ such that
$A = \Phi_{BB}(\alpha)$. If $A$ is nonsingular then there also exists an endomorphism $\beta$ of $V$
satisfying $A^{-1} = \Phi_{BB}(\beta)$. This means that
$I = AA^{-1} = \Phi_{BB}(\alpha)\Phi_{BB}(\beta) = \Phi_{BB}(\alpha\beta)$ and so $\alpha\beta = \sigma_1$, and
similarly $\beta\alpha = \sigma_1$. Therefore, $\alpha \in \mathrm{Aut}(V)$ and $\beta = \alpha^{-1}$.


Example Let $F$ be a field and let $n$ be a positive integer. If $c \in F$ and $v, w \in F^n$,
then the matrix $A = I + c(v \wedge w)$ is nonsingular if and only if the scalar
$1 + c(v \odot w)$ is nonzero. Indeed, direct computation shows that if $1 + c(v \odot w) \neq 0$, then
$A^{-1} = I + d(v \wedge w)$, where $d = -c[1 + c(v \odot w)]^{-1}$, and if $1 + c(v \odot w) = 0$ then
$Av = v + c(v \odot w)v$ is the 0-vector, and so $A$ must be singular.
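
The "direct computation" can be tried on a concrete case. In the Python sketch below, `outer` plays the role of $v \wedge w$ (the matrix with entries $v_i w_j$) and `dot` that of $v \odot w$; the sample vectors and the scalar $c$ are arbitrary illustrative choices.

```python
from fractions import Fraction as Fr


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def outer(v, w):
    """The n x n matrix whose (i, j) entry is v_i * w_j."""
    return [[a * b for b in w] for a in v]


def identity_plus(c, M):
    """Return I + c*M for a square matrix M."""
    n = len(M)
    return [[(1 if i == j else 0) + c * M[i][j] for j in range(n)] for i in range(n)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


v = [Fr(1), Fr(2), Fr(0)]
w = [Fr(3), Fr(-1), Fr(4)]
c = Fr(2)

s = 1 + c * dot(v, w)          # the scalar 1 + c(v . w); here it equals 3
d = -c / s                     # the coefficient in the claimed inverse
A = identity_plus(c, outer(v, w))
A_inv = identity_plus(d, outer(v, w))
product = matmul(A, A_inv)
```

The product is the identity because $(v \wedge w)^2 = (v \odot w)(v \wedge w)$, so the coefficient of $v \wedge w$ in the product is $c + d + cd(v \odot w) = 0$.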


Example The multiplicative inverse of a “nice” nonsingular matrix may not be
“nice”. Thus, if $A \in M_{n\times n}(\mathbb{R})$ is a nonsingular matrix all of the entries of which
are nonnegative, it does not follow that all of the entries of $A^{-1}$ are nonnegative.
For example, if we choose

\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}
\]

then direct computation shows us that

\[
A^{-1} = \begin{bmatrix} 3 & -1 & -1 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}.
\]

If $A = [a_{ij}]$ is the $n \times n$ tridiagonal matrix with $a_{ii} = 2$
for all $1 \le i \le n$ and $a_{ij} = -1$ whenever $|i - j| = 1$, then not only is $A^{-1}$ not
tridiagonal, but in fact no entries in $A^{-1}$ equal 0, for any $n > 1$.


Example If a matrix $A \in M_{n\times n}(F)$ can be written in block form $[A_{ij}]$, where each $A_{ii}$
is a nonsingular square matrix and $A_{ij} = O$ for $i \neq j$, then $A$ is nonsingular, and
$A^{-1} = [B_{ij}]$, where $B_{ii} = A_{ii}^{-1}$ for each $i$ and $B_{ij} = O$ for each $i \neq j$. In particular,
if each $A_{ii}$ is involutory, then so is $A$.


Example Let $n$ be a prime positive integer. The complex number
$c_n = \cos\left(\frac{2\pi}{n}\right) + i \sin\left(\frac{2\pi}{n}\right)$ is called a primitive root of unity of degree $n$, since it is
easy to check that $c_n^n = 1$ but $c_n^h \neq 1$ for all $0 < h < n$. Therefore, $c_n^{-1} = c_n^{n-1}$ for all $n$.
For each $z \in \mathbb{C}$, let $F(z) \in M_{n\times n}(\mathbb{C})$ be the matrix $[a_{ij}]$ defined by
$a_{ij} = z^{(i-1)(j-1)}$ for all $1 \le i, j \le n$. It is straightforward to show that the matrix
$F(c_n)$ is nonsingular and, indeed, $F(c_n)^{-1} = \frac{1}{n}F(c_n^{-1})$. The endomorphism $\varphi_n$ of
$\mathbb{C}^n$ which is represented with respect to the canonical basis by the matrix $F(c_n)$ is called
the discrete Fourier transform of $\mathbb{C}^n$. This endomorphism is of great importance in applied
mathematics. An algorithm, known as the fast Fourier transform (FFT), introduced by
J.W. Cooley and John Tukey in 1965, allows one to calculate $\varphi_n(v)$ in an order of $n \log(n)$
arithmetic operations, rather than $n^2$, as one would anticipate. This facilitates the
use of Fourier transforms in applications. A similar construction is also possible
over finite fields, and especially over fields of the form $GF(p)$. We will look at this
example again in Chap. 15.

A closely-related endomorphism, the discrete cosine transform, is used in defin- 
ing the JPEG algorithm for image compression. 
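
The inverse formula $F(c_n)^{-1} = \frac{1}{n}F(c_n^{-1})$ for the Fourier matrix can be checked numerically. The sketch below is a plain floating-point check for one value of $n$, not an FFT; the function names are illustrative, and since the arithmetic is inexact the comparison with the identity matrix is made only up to rounding error.

```python
import cmath


def fourier_matrix(z, n):
    """The matrix [z^((i-1)(j-1))] for 1 <= i, j <= n, built with 0-based indices."""
    return [[z ** (i * j) for j in range(n)] for i in range(n)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


n = 5
c = cmath.exp(2j * cmath.pi / n)       # cos(2*pi/n) + i*sin(2*pi/n)

F = fourier_matrix(c, n)
G = [[x / n for x in row] for row in fourier_matrix(1 / c, n)]   # (1/n) F(c^-1)

P = matmul(F, G)
max_err = max(abs(P[i][j] - (1 if i == j else 0)) for i in range(n) for j in range(n))
```

Computing $F(c_n)v$ this way costs about $n^2$ multiplications; the FFT mentioned in the text reorganizes the same sums to reach the order of $n \log(n)$.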


Joseph Fourier was a close friend of Napoleon and served for many 
years as permanent secretary of the Parisian Academy of Sciences. He 
worked primarily in applied mathematics, and developed many im- 
portant tools in this area. John Tukey was a twentieth-century Amer- 
ican statistician who developed many advanced mathematical tools in 
statistics. 


Let $(K, \bullet)$ be an algebra over a field $F$. A matrix representation of $K$ by matrices
over $F$ is a homomorphism of $F$-algebras from $K$ to $M_{n\times n}(F)$ for some positive
integer $n$. Matrix representations are a very important tool in studying the structure
of algebras over fields. More generally, a representation of $K$ over $F$ is a homomorphism
of $F$-algebras from $K$ to $\mathrm{End}(V)$ for some vector space $V$, not necessarily
finitely generated, over $F$.


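Example Recall that $\mathbb{C}$ is an algebra of dimension 2 over $\mathbb{R}$. The function
$\gamma: \mathbb{C} \to M_{2\times 2}(\mathbb{R})$ defined by

\[
\gamma: a + bi \mapsto \begin{bmatrix} a & b \\ -b & a \end{bmatrix}
\]

is a matrix representation of $\mathbb{C}$ by matrices over $\mathbb{R}$. In fact, this representation is
clearly monic and its image is the subalgebra $T$ of $M_{2\times 2}(\mathbb{R})$ consisting of all
matrices of the form $\begin{bmatrix} a & b \\ -b & a \end{bmatrix}$, so $\gamma$ is
an $\mathbb{R}$-algebra isomorphism from $\mathbb{C}$ to $T$.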


Let $(K, \bullet)$ be an associative unital algebra over a field $F$ having multiplicative
identity $e$, and let $n$ be a positive integer. Let $E$ be the multiplicative identity of
$M_{n\times n}(K)$. A matrix $A = [e_{ij}] \in M_{n\times n}(K)$ is an elementary matrix if and only if it
is of one of the following forms:

(1) $E_{hk}$, the matrix formed from $E$ by interchanging the $h$th and $k$th columns,
where $h \neq k$;

(2) $E_{h;c}$, the matrix formed from $E$ by multiplying the $h$th column by $0_K \neq c \in K$;

(3) $E_{hk;c}$, the matrix formed from $E$ by adding $c$ times the $k$th column to the $h$th
column, where $h \neq k$ and $c \in K$.

It is easy to verify that matrices of the form $E_{hk}$ and $E_{hk;c}$ are always nonsingular,
with $E_{hk}^{-1} = E_{hk}$ and $E_{hk;c}^{-1} = E_{hk;-c}$. If $c$ is a unit in $K$, then matrices
of the form $E_{h;c}$ are nonsingular, with $E_{h;c}^{-1} = E_{h;c^{-1}}$. Thus, if $K$ is a field (and
in particular, if $K = F$), every elementary matrix of the form $E_{h;c}$ is nonsingular.
We note that the transpose of an elementary matrix is again an elementary matrix.
Indeed, $E_{hk}^T = E_{hk}$ and $E_{h;c}^T = E_{h;c}$ for all $1 \le h, k \le n$ and $0_K \neq c \in K$, while
$E_{hk;c}^T = E_{kh;c}$ for all $1 \le h \neq k \le n$ and all $c \in K$.

As the name clearly implies, there is a connection between the elementary
automorphisms which we defined previously and the elementary matrices. Indeed, if
$K = F$ and if $B$ is the canonical basis of $F^n$, then $E_{hk} = \Phi_{BB}(\varepsilon_{hk})$,
$E_{h;c} = \Phi_{BB}(\varepsilon_{h;c})$, and $E_{hk;c} = \Phi_{BB}(\varepsilon_{hk;c})$.
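Let us see what happens when one multiplies an arbitrary matrix in $M_{n\times n}(K)$
on the left by an elementary matrix:

(1) If $B \in M_{n\times n}(K)$ then $E_{hk}B$ is the matrix obtained from $B$ by interchanging
the $h$th and $k$th rows of $B$. Thus, for example, in $M_{4\times 4}(\mathbb{Q})$ we look at the effect
of $E_{24}$:

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 5 & 6 & 4 & 1 \\ 3 & 2 & 2 & 2 \\ 0 & 4 & 2 & 7 \\ 3 & 3 & 2 & 2 \end{bmatrix}
= \begin{bmatrix} 5 & 6 & 4 & 1 \\ 3 & 3 & 2 & 2 \\ 0 & 4 & 2 & 7 \\ 3 & 2 & 2 & 2 \end{bmatrix}.
\]

(2) If $B \in M_{n\times n}(K)$ then $E_{h;c}B$ is the matrix obtained from $B$ by multiplying
the $h$th row of $B$ by $c$. Thus, for example, in $M_{4\times 4}(\mathbb{Q})$ we look at the effect
of $E_{2;5}$:

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 5 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 5 & 6 & 4 & 1 \\ 3 & 2 & 2 & 2 \\ 0 & 4 & 2 & 7 \\ 3 & 3 & 2 & 2 \end{bmatrix}
= \begin{bmatrix} 5 & 6 & 4 & 1 \\ 15 & 10 & 10 & 10 \\ 0 & 4 & 2 & 7 \\ 3 & 3 & 2 & 2 \end{bmatrix}.
\]

(3) If $B \in M_{n\times n}(K)$ then $E_{hk;c}B$ is the matrix obtained from $B$ by adding $c$ times
the $h$th row to the $k$th row. Thus, for example, in $M_{4\times 4}(\mathbb{Q})$ we look at the effect
of $E_{13;2}$:

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 2 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 5 & 6 & 4 & 1 \\ 3 & 2 & 2 & 2 \\ 0 & 4 & 2 & 7 \\ 3 & 3 & 2 & 2 \end{bmatrix}
= \begin{bmatrix} 5 & 6 & 4 & 1 \\ 3 & 2 & 2 & 2 \\ 10 & 16 & 10 & 9 \\ 3 & 3 & 2 & 2 \end{bmatrix}.
\]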
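
The three worked products can be reproduced mechanically. In the Python sketch below the constructors `E_swap`, `E_scale` and `E_add` are hypothetical names for the matrices $E_{hk}$, $E_{h;c}$ and $E_{hk;c}$; they follow the column-based definitions given above and use 1-based indices as the text does.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def E_swap(n, h, k):
    """E_{hk}: the identity with columns h and k interchanged."""
    E = identity(n)
    for row in E:
        row[h - 1], row[k - 1] = row[k - 1], row[h - 1]
    return E


def E_scale(n, h, c):
    """E_{h;c}: the identity with column h multiplied by c."""
    E = identity(n)
    E[h - 1][h - 1] = c
    return E


def E_add(n, h, k, c):
    """E_{hk;c}: the identity with c times column k added to column h (h != k)."""
    E = identity(n)
    E[k - 1][h - 1] += c      # column h now has the entry c in row k
    return E


B = [[5, 6, 4, 1], [3, 2, 2, 2], [0, 4, 2, 7], [3, 3, 2, 2]]

swapped = matmul(E_swap(4, 2, 4), B)      # rows 2 and 4 of B interchanged
scaled = matmul(E_scale(4, 2, 5), B)      # row 2 of B multiplied by 5
added = matmul(E_add(4, 1, 3, 2), B)      # 2 times row 1 added to row 3
```

Although each matrix is defined through a column operation on $E$, multiplying on the left acts on the rows of $B$, which is the point of the three cases.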


Proposition 9.1 If $F$ is a field, if $n$ is a positive integer, and if $A, B, C, D \in
M_{n\times n}(F)$ then:

(1) When $A$ and $B$ are nonsingular, so is $AB$, with $(AB)^{-1} = B^{-1}A^{-1}$;

(2) When $AB$ is nonsingular, both $A$ and $B$ are nonsingular;

(3) When $A$ and $B$ are nonsingular, $A^{-1} + B^{-1} = A^{-1}(B + A)B^{-1}$;

(4) When $I + AB$ is nonsingular, so is $I + BA$, and
$(I + BA)^{-1} = I - B(I + AB)^{-1}A$;

(5) (Guttman’s Theorem) If $A$ is nonsingular and if $v, w \in F^n$ satisfy
the condition that $1 + w \odot A^{-1}v \neq 0$, then the matrix $A + v \wedge w \in
M_{n\times n}(F)$ is nonsingular and satisfies
$(A + v \wedge w)^{-1} = A^{-1} - (1 + w \odot A^{-1}v)^{-1}(A^{-1}[v \wedge w]A^{-1})$;

(6) (Sherman-Morrison-Woodbury Theorem) When the matrices $C$, $D$,
$D^{-1} + AC^{-1}B$, and $C + BDA$ are nonsingular, then
$(C + BDA)^{-1} = C^{-1} - C^{-1}B(D^{-1} + AC^{-1}B)^{-1}AC^{-1}$.


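Proof (1) This is a special case of a general remark about units in associative
$F$-algebras, which we have already noted.

(2) Let $V$ be a vector space of dimension $n$ over $F$, and let $D$ be a basis of $V$. Then
there exist endomorphisms $\alpha$ and $\beta$ of $V$ satisfying $A = \Phi_{DD}(\alpha)$ and $B = \Phi_{DD}(\beta)$,
and so $AB = \Phi_{DD}(\alpha\beta)$. Since $AB$ is nonsingular, we know that $\alpha\beta \in \mathrm{Aut}(V)$.
Then there exists an automorphism $\gamma$ of $V$ satisfying $\gamma(\alpha\beta) = \sigma_1 = (\alpha\beta)\gamma$. Then
$(\gamma\alpha)\beta = \sigma_1 = \alpha(\beta\gamma)$ and so, by Proposition 7.4, we know that both $\alpha$ and $\beta$ are
automorphisms of $V$, and hence both $A$ and $B$ are nonsingular.

(3) This is an immediate consequence of the fact that $A(A^{-1} + B^{-1})B = B + A$.

(4) We note that

\[
\begin{aligned}
(I + BA)[I - B(I + AB)^{-1}A] &= I + BA - (B + BAB)(I + AB)^{-1}A \\
&= I + BA - B(I + AB)(I + AB)^{-1}A \\
&= I + BA - BA = I.
\end{aligned}
\]

(5) A simple calculation shows us that if $x, y \in F^n$ satisfy the condition that
$c = 1 + y \odot x$ is nonzero, then

\[
\begin{aligned}
(I - c^{-1}[x \wedge y])(I + [x \wedge y]) &= I + x \wedge y - c^{-1}[x \wedge y] - c^{-1}(x \wedge y)^2 \\
&= I + x \wedge y - c^{-1}c[x \wedge y] = I,
\end{aligned}
\]

where we have used $(x \wedge y)^2 = (y \odot x)[x \wedge y] = (c - 1)[x \wedge y]$,
and so $(I + [x \wedge y])^{-1} = I - c^{-1}[x \wedge y]$. Therefore, if we set $d = 1 + w \odot A^{-1}v$
then

\[
\begin{aligned}
(A + v \wedge w)^{-1} &= [A(I + A^{-1}[v \wedge w])]^{-1} = (I + A^{-1}[v \wedge w])^{-1}A^{-1} \\
&= [I - d^{-1}(A^{-1}[v \wedge w])]A^{-1} \\
&= A^{-1} - d^{-1}(A^{-1}[v \wedge w]A^{-1}),
\end{aligned}
\]

as required.

(6) First, note that $I + C^{-1}BDA = C^{-1}(C + BDA)$ and so, by (1), this matrix is
nonsingular as well. By (4), $(I + C^{-1}BDA)^{-1} = I - C^{-1}B(I + DAC^{-1}B)^{-1}DA$,
and so

\[
\begin{aligned}
(C + BDA)^{-1} &= [C(I + C^{-1}BDA)]^{-1} \\
&= [I - C^{-1}B(I + DAC^{-1}B)^{-1}DA]C^{-1} \\
&= C^{-1} - C^{-1}B[D^{-1}(I + DAC^{-1}B)]^{-1}AC^{-1} \\
&= C^{-1} - C^{-1}B(D^{-1} + AC^{-1}B)^{-1}AC^{-1},
\end{aligned}
\]

as required.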




Louis Guttman was a twentieth-century American/Israeli statistician 
and sociologist who developed many advanced mathematical tools for 
use in statistics. The Sherman—Morrison—Woodbury Theorem was in 
fact first published by British aeronautics professor W.J. Duncan, but 
is named after the twentieth-century American statisticians Jack Sher- 
man, Winifred J. Morrison, and Max Woodbury who used it exten- 
sively. 
Guttman’s Theorem is important in the following context: assume we have
calculated $A^{-1}$ for some square matrix $A$ and now we have to calculate $B^{-1}$, where
$B$ differs from $A$ in only one entry. With the help of this result, we can make use of
our knowledge of $A^{-1}$ to calculate $B^{-1}$ with relative ease and speed. The
Sherman-Morrison-Woodbury Theorem has similar uses.
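
A minimal Python sketch of this use of Guttman's Theorem follows. It takes the matrix $A$ with rows $(1,1,1)$, $(1,2,1)$, $(1,1,2)$ and its inverse from the earlier example, and changes the single entry in position $(1,3)$ from 1 to 3, which corresponds to $v = 2e_1$ and $w = e_3$. The function name `guttman_update` and this choice of entry are assumptions made for the illustration.

```python
from fractions import Fraction as Fr


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def guttman_update(A_inv, v, w):
    """Return the inverse of A + (v wedge w), given the inverse of A."""
    n = len(A_inv)
    x = [sum(A_inv[i][k] * v[k] for k in range(n)) for i in range(n)]   # A^-1 v
    y = [sum(w[k] * A_inv[k][j] for k in range(n)) for j in range(n)]   # w^T A^-1
    d = 1 + sum(w[i] * x[i] for i in range(n))                          # 1 + w . A^-1 v
    if d == 0:
        raise ValueError("the updated matrix is singular")
    # A^-1 (v wedge w) A^-1 has (i, j) entry x[i] * y[j]
    return [[A_inv[i][j] - x[i] * y[j] / d for j in range(n)] for i in range(n)]


A_inv = [[Fr(3), Fr(-1), Fr(-1)], [Fr(-1), Fr(1), Fr(0)], [Fr(-1), Fr(0), Fr(1)]]
v = [Fr(2), Fr(0), Fr(0)]     # 2 * e_1
w = [Fr(0), Fr(0), Fr(1)]     # e_3
B = [[Fr(1), Fr(1), Fr(3)], [Fr(1), Fr(2), Fr(1)], [Fr(1), Fr(1), Fr(2)]]

B_inv = guttman_update(A_inv, v, w)
```

The update uses on the order of $n^2$ arithmetic operations, compared with the order of $n^3$ needed to invert $B$ from scratch.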

In particular, we note from Proposition 9.1 that if $A, B \in M_{n\times n}(F)$ then $AB$ is
nonsingular if and only if $BA$ is nonsingular. We should also note that if $A, B \in
M_{n\times n}(F)$ then $B^TA^T = (AB)^T$, and so if $A$ is a nonsingular matrix and $B = A^{-1}$
then $AB = I$, and so $B^TA^T = I^T = I$. Thus $A^T$ is also nonsingular. Moreover, this
also shows that $(A^T)^{-1} = (A^{-1})^T$ for every nonsingular matrix $A \in M_{n\times n}(F)$.


Proposition 9.2 Let $F$ be a field, let $n$ be a positive integer, and let $A, B \in
M_{n\times n}(F)$, where $A$ is nonsingular. Then there exist unique matrices $C$ and $D$
in $M_{n\times n}(F)$ satisfying $CA = B = AD$.


Proof Define $C = BA^{-1}$ and $D = A^{-1}B$. Then surely $CA = B = AD$. If $C'$ and
$D'$ are matrices satisfying $C'A = B = AD'$ then $C' = (C'A)A^{-1} = BA^{-1} = C$ and
$D' = A^{-1}(AD') = A^{-1}B = D$, and so we have uniqueness.


Example The matrices C and D in Proposition 9.2 need not be the same. For ex- 


ample, if A, B € M2 2(R) are defined by A = ‘ al and B= E ‘| then 


1 -3 1 —3 —8 -3 
-1_ = = 
are |e | os S| 


Proposition 9.3 Let $F$ be a field, let $n$ be a positive integer, and let
$A = [a_{ij}] \in M_{n\times n}(F)$. Then the following conditions are equivalent:

(1) $A$ is nonsingular;

(2) The columns of $A$ are distinct and the set of these columns is a
linearly-independent subset of $F^n$;

(3) The rows of $A$ are distinct and the set of these rows is a
linearly-independent subset of $M_{1\times n}(F)$.


Proof (1) $\Leftrightarrow$ (2): Denote the columns of $A$ by $y_1, \ldots, y_n$. Let $V = F^n$ and let
$B = \{v_1, \ldots, v_n\}$ be the canonical basis of $V$. If two columns of $A$ are equal
or if the set of columns is linearly dependent, there exist scalars $c_1, \ldots, c_n$, not
all equal to 0, such that

\[
A \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = \sum_{i=1}^{n} c_i y_i = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}.
\]

But if (1) holds, then

\[
\begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = A^{-1} \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix},
\]

which is a contradiction. Therefore, (2) holds. Conversely, assume (2) holds. Then the
endomorphism $\alpha$ of $V$ given by $v \mapsto Av$ is monic and so an automorphism of $V$.
But $A = \Phi_{BB}(\alpha)$, and so, as we have seen, $A$ is nonsingular.

(1) $\Leftrightarrow$ (3): This follows directly from the equivalence of (1) and (2), given the
fact that a matrix $A$ is nonsingular if and only if $A^T$ is nonsingular.


Example Let $F$ be a field and let $n > 1$ be an integer. If $v, w \in F^n$, then the columns
of $v \wedge w \in M_{n\times n}(F)$ are linearly dependent and so $v \wedge w$ is always singular.


Example If $F$ is a field and if $U = [u_{ij}] \in M_{n\times n}(F)$ is an upper-triangular matrix
satisfying the condition that $u_{ii} \neq 0$ for all $1 \le i \le n$ then, by Proposition 9.3,
it is clear that $U$ is nonsingular. We claim that, moreover, $U^{-1}$ is again upper
triangular. Let us prove this contention by induction on $n$. It is clearly true for
$n = 1$. Assume therefore that $n > 1$ and that we have already shown that the
inverse of any upper-triangular matrix in $M_{(n-1)\times(n-1)}(F)$ is upper triangular. Write

\[
U = \begin{bmatrix} A & y \\ z^T & u_{nn} \end{bmatrix},
\]

where $A \in M_{(n-1)\times(n-1)}(F)$, $y \in F^{n-1}$, and $z = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. Assume that

\[
U^{-1} = \begin{bmatrix} B & x \\ w^T & b \end{bmatrix},
\]

where $B \in M_{(n-1)\times(n-1)}(F)$ and $w, x \in F^{n-1}$. Then
$AB + y \wedge w = I$, $Ax + by = z$, $u_{nn}w^T = z^T$, and $u_{nn}b = 1$, so we must have
$b = u_{nn}^{-1} \neq 0$ and $w^T = z^T$. Therefore, $y \wedge w = O$ and so $B = A^{-1}$. By hypothesis, $B$
is upper triangular and so $U^{-1}$ is again upper triangular. A similar argument holds
for lower-triangular matrices.


Proposition 9.4 Let $F$ be a field and let $n$ be a positive integer. A matrix in
$M_{n\times n}(F)$ is nonsingular if and only if it is a product of elementary matrices.


Proof Since each of the elementary matrices is nonsingular, we know that any product
of elementary matrices is also nonsingular. Conversely, let $A = [a_{ij}]$ be a nonsingular
matrix in $M_{n\times n}(F)$ and let $B = [b_{ij}]$ be $A^{-1}$. Then $B$ is also nonsingular
and so, by Proposition 9.3, the columns of $B$ are distinct and the set of columns is
linearly independent in $F^n$. In particular, there exists a nonzero entry $b_{h1}$ in the first
column of $B$. Multiply $B$ on the left by $E_{1h}$ to get a new matrix in which the
$(1, 1)$-entry is nonzero. Now multiply it on the left by $E_{1;c}$, where $c = b_{h1}^{-1}$, in order to get
a matrix of the form

\[
\begin{bmatrix} 1 & * & \cdots & * \\ * & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ * & * & \cdots & * \end{bmatrix}.
\]

Now let $1 < t \le n$, and let $d(t)$ be the additive inverse of the $(t, 1)$-entry of the matrix.
Multiplying the matrix on the left by $E_{1t;d(t)}$, we will get a matrix with 0 in the
$(t, 1)$-entry and so, after doing this for each such $t$, we obtain a matrix $B' = C_1B$,
where $C_1$ is a product of elementary matrices, which is of the form

\[
\begin{bmatrix} 1 & * & \cdots & * \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{bmatrix}.
\]

This matrix is still nonsingular, since it is a product of two nonsingular matrices, and
so its columns are distinct and form a linearly-independent subset of $F^n$. Therefore,
there exists a nonzero entry $b'_{h2}$ in the second column, with $h > 1$. Repeating the above
procedure, we can find a matrix $C_2$ which is a product of elementary matrices and such
that $C_2C_1B$ is of the form

\[
\begin{bmatrix} 1 & 0 & * & \cdots & * \\ 0 & 1 & * & \cdots & * \\ 0 & 0 & * & \cdots & * \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & * & \cdots & * \end{bmatrix}.
\]

Continuing in this manner, we obtain matrices $C_1, \ldots, C_n$,
each of them a product of elementary matrices, such that $C_n \cdots C_1B = I$. Therefore,
$C_n \cdots C_1 = B^{-1} = A$, as we wanted to show.


Example Let $F$ be a field and let $n$ be a positive integer. Every permutation $\pi$ of the
set $\{1, \ldots, n\}$ defines a matrix $A_\pi = [a_{ij}] \in M_{n\times n}(F)$ by setting $a_{ij} = 1$ if $j = \pi(i)$
and $a_{ij} = 0$ otherwise, called the permutation matrix defined by $\pi$. This matrix is
clearly a result of multiplying $I$ by a number of elementary matrices of the form
$E_{hk}$, and so is nonsingular.


The order of multiplication given in Proposition 9.4 is not unique. Indeed, we
claim that it is possible to write any nonsingular matrix $A \in M_{n\times n}(F)$ in the form
$PC$, where $P$ is a permutation matrix and $C$ is a product of elementary matrices of
the form $E_{i;c}$ and $E_{ij;c}$. To see how this is done, we note that if $1 \le i, h, k \le n$ and
if $c \in F$ then $E_{i;c}E_{hk} = E_{hk}E_{i;c}$ if $i \notin \{h, k\}$ and $E_{h;c}E_{hk} = E_{hk}E_{k;c}$, and a similar
result holds for elementary matrices of the form $E_{ij;c}$ and $E_{hk}$. Thus, one by one,
the elementary matrices of the form $E_{hk}$ can be “moved to the left” until we obtain the
desired decomposition.

Proposition 9.4 allows us to construct an algorithm for computing $A^{-1}$ when
$A$ is a nonsingular matrix in $M_{n\times n}(F)$. First of all, we construct the matrix
$[I \;\; A] \in M_{n\times 2n}(F)$ and on this matrix we perform a series of elementary operations,
namely operations which are the result of multiplying it on the left by elementary
matrices, which bring the right-hand block into the form $I$. Then the left-hand
block is $A^{-1}$. To calculate $A^{-1}$ by this method, we use $n^3 - 2n^2 + n$ additions and
$n^3$ multiplications.
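Example Consider the matrix

\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 0 \\ 0 & 1 & 2 \end{bmatrix} \in M_{3\times 3}(\mathbb{Q}).
\]

We begin with the matrix

\[
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 2 & 3 \\ 0 & 1 & 0 & 2 & 3 & 0 \\ 0 & 0 & 1 & 0 & 1 & 2 \end{array}\right] \in M_{3\times 6}(\mathbb{Q}).
\]

Then we

(1) Get
\[
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 2 & 3 \\ -2 & 1 & 0 & 0 & -1 & -6 \\ 0 & 0 & 1 & 0 & 1 & 2 \end{array}\right]
\]
after multiplying the first row by $-2$ and adding it to the second row;

(2) Get
\[
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 2 & 3 \\ -2 & 1 & 0 & 0 & -1 & -6 \\ -2 & 1 & 1 & 0 & 0 & -4 \end{array}\right]
\]
after adding the second row to the third row;

(3) Get
\[
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 2 & 3 \\ 2 & -1 & 0 & 0 & 1 & 6 \\ 0.5 & -0.25 & -0.25 & 0 & 0 & 1 \end{array}\right]
\]
after multiplying the second row by $-1$ and then multiplying the third row by $-0.25$;

(4) Get
\[
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 2 & 3 \\ -1 & 0.5 & 1.5 & 0 & 1 & 0 \\ 0.5 & -0.25 & -0.25 & 0 & 0 & 1 \end{array}\right]
\]
after multiplying the third row by $-6$ and adding it to the second row;

(5) Get
\[
\left[\begin{array}{ccc|ccc} -0.5 & 0.75 & 0.75 & 1 & 2 & 0 \\ -1 & 0.5 & 1.5 & 0 & 1 & 0 \\ 0.5 & -0.25 & -0.25 & 0 & 0 & 1 \end{array}\right]
\]
after multiplying the third row by $-3$ and adding it to the first row;

(6) Finally, get
\[
\left[\begin{array}{ccc|ccc} 1.5 & -0.25 & -2.25 & 1 & 0 & 0 \\ -1 & 0.5 & 1.5 & 0 & 1 & 0 \\ 0.5 & -0.25 & -0.25 & 0 & 0 & 1 \end{array}\right]
\]
after multiplying the second row by $-2$ and adding it to the first row.

Therefore, we see that

\[
A^{-1} = \frac{1}{4}\begin{bmatrix} 6 & -1 & -9 \\ -4 & 2 & 6 \\ 2 & -1 & -1 \end{bmatrix}.
\]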
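
The procedure in this example can be written as a short routine. The Python sketch below is one possible arrangement of the row operations (it clears each column completely before moving on, so its intermediate matrices differ from the six steps above), and the name `inverse` is only illustrative. It works with exact fractions on the matrix $[I \;\; A]$.

```python
from fractions import Fraction


def inverse(A):
    """Invert a square matrix by row-reducing [I | A] until the right block is I."""
    n = len(A)
    # Left block starts as I, right block as A.
    M = [[Fraction(int(i == j)) for j in range(n)] + [Fraction(x) for x in A[i]]
         for i in range(n)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        p = next((r for r in range(col, n) if M[r][n + col] != 0), None)
        if p is None:
            raise ValueError("matrix is singular")
        M[col], M[p] = M[p], M[col]                  # row interchange
        pivot = M[col][n + col]
        M[col] = [x / pivot for x in M[col]]         # scale the pivot row
        for r in range(n):
            if r != col and M[r][n + col] != 0:
                f = M[r][n + col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]   # add a multiple of a row
    return [row[:n] for row in M]


A = [[1, 2, 3], [2, 3, 0], [0, 1, 2]]
A_inv = inverse(A)
```

Each of the three kinds of step in the loop is a left multiplication by an elementary matrix, as in Proposition 9.4.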


Example When one uses a computer to compute matrix inverses, one must always
be aware of hardware limitations. For example, one can show that Nievergelt’s
matrix

\[
A = \begin{bmatrix} 888445 & 887112 \\ 887112 & 885781 \end{bmatrix} \in M_{2\times 2}(\mathbb{Q})
\]

is nonsingular, while the matrix

\[
B = A - \begin{bmatrix} c & -c \\ -c & c \end{bmatrix},
\]

where $c = \frac{1}{3548450}$ (which is approximately $2.818 \times 10^{-7}$), is
not. Nonetheless, a computer or calculator capable of only 12-digit accuracy cannot
differentiate between the two.
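
The claim can be examined with exact arithmetic. The sketch below assumes the usual form of Nievergelt's matrix, with entries 888445, 887112, 887112, 885781, and the perturbation $c = 1/3548450$ subtracted from the diagonal entries and added to the off-diagonal ones; these specific values are assumptions wherever the printed text is illegible. The 12-digit machine is simulated with the standard `decimal` module.

```python
from decimal import Context, Decimal
from fractions import Fraction


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A = [[Fraction(888445), Fraction(887112)],
     [Fraction(887112), Fraction(885781)]]
c = Fraction(1, 3548450)
B = [[A[0][0] - c, A[0][1] + c],
     [A[1][0] + c, A[1][1] - c]]


def to_12_digits(x):
    """Round an exact rational number to 12 significant decimal digits."""
    return Context(prec=12).divide(Decimal(x.numerator), Decimal(x.denominator))


A12 = [[to_12_digits(x) for x in row] for row in A]
B12 = [[to_12_digits(x) for x in row] for row in B]
```

In exact arithmetic `det2(A)` is 1 and `det2(B)` is 0, yet `A12` and `B12` hold the same twelve-digit entries.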


Example For each positive integer $n$, let $H_n \in M_{n\times n}(\mathbb{Q})$ be the matrix $[a_{ij}]$ in
which $a_{ij} = \frac{1}{i+j-1}$. This matrix is called the $n \times n$ Hilbert matrix. Hilbert matrices
are all nonsingular but, while their entries all lie between 0 and 1, the entries in their
inverses are very large. For example, $H_6^{-1}$ equals

\[
\begin{bmatrix}
36 & -630 & 3360 & -7560 & 7560 & -2772 \\
-630 & 14700 & -88200 & 211680 & -220500 & 83160 \\
3360 & -88200 & 564480 & -1411200 & 1512000 & -582120 \\
-7560 & 211680 & -1411200 & 3628800 & -3969000 & 1552320 \\
7560 & -220500 & 1512000 & -3969000 & 4410000 & -1746360 \\
-2772 & 83160 & -582120 & 1552320 & -1746360 & 698544
\end{bmatrix}.
\]

Therefore, these matrices are often used as benchmarks to judge the efficiency and
accuracy of computer programs to calculate matrix inverses. In particular, if the
computer we are using has only 7-digit accuracy, it is reasonable to assume that we will
have a 100% error in computing $H_6^{-1}$.
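
The displayed inverse can be confirmed exactly. The Python sketch below builds $H_6$ from rational numbers and multiplies it by the $6 \times 6$ table of integers above; the names `hilbert` and `H6_INV` are illustrative.

```python
from fractions import Fraction


def hilbert(n):
    """The n x n Hilbert matrix with entries 1/(i + j - 1), 1-based i and j."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


# The inverse of the 6 x 6 Hilbert matrix, as displayed in the example.
H6_INV = [
    [36, -630, 3360, -7560, 7560, -2772],
    [-630, 14700, -88200, 211680, -220500, 83160],
    [3360, -88200, 564480, -1411200, 1512000, -582120],
    [-7560, 211680, -1411200, 3628800, -3969000, 1552320],
    [7560, -220500, 1512000, -3969000, 4410000, -1746360],
    [-2772, 83160, -582120, 1552320, -1746360, 698544],
]

product = matmul(hilbert(6), H6_INV)
largest = max(abs(x) for row in H6_INV for x in row)
```

The entries of $H_6$ are all at most 1 while `largest` is in the millions, which is why small rounding errors in $H_6$ are magnified so strongly in a computed inverse.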



The German mathematician David Hilbert was one of the foremost mathematicians in the
world at the beginning of the twentieth century. He and his students
were among the first to study infinite-dimensional vector spaces.


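It is sometimes possible to use a representation of a nonsingular matrix $A$ in block
form in order to calculate $A^{-1}$. Indeed, suppose that $A \in M_{n\times n}(F)$ is a matrix
which can be written in block form

\[
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},
\]

where $A_{11} \in M_{k\times k}(F)$. If $A_{11}$ and $C = A_{22} - A_{21}A_{11}^{-1}A_{12}$ are both nonsingular,
then $A$ is also nonsingular, with

\[
A^{-1} = \begin{bmatrix} I & -A_{11}^{-1}A_{12} \\ O & I \end{bmatrix}
\begin{bmatrix} A_{11}^{-1} & O \\ O & C^{-1} \end{bmatrix}
\begin{bmatrix} I & O \\ -A_{21}A_{11}^{-1} & I \end{bmatrix}.
\]

Similarly, if $A_{22}$ and $D = A_{11} - A_{12}A_{22}^{-1}A_{21}$ are both nonsingular, then $A$ is also
nonsingular, with

\[
A^{-1} = \begin{bmatrix} I & O \\ -A_{22}^{-1}A_{21} & I \end{bmatrix}
\begin{bmatrix} D^{-1} & O \\ O & A_{22}^{-1} \end{bmatrix}
\begin{bmatrix} I & -A_{12}A_{22}^{-1} \\ O & I \end{bmatrix}.
\]

The matrices $C$ and $D$ are, respectively, the Schur complements of $A_{11}$ and $A_{22}$
in $A$. These conditions, however, are sufficient but not necessary for $A$ to be
nonsingular, as the following example shows.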
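
The first of the two block formulas can be tested on a small case. The Python sketch below uses a $4 \times 4$ matrix over $\mathbb{Q}$ cut into $2 \times 2$ blocks; the particular blocks and the helper names (`inv2`, `block`, and so on) are arbitrary choices for illustration.

```python
from fractions import Fraction as Fr


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def sub(X, Y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def neg(X):
    return [[-a for a in row] for row in X]


def inv2(X):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = X
    det = a * d - b * c
    if det == 0:
        raise ValueError("block is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def block(P, Q, R, S):
    """Assemble the 4 x 4 matrix with 2 x 2 blocks P, Q on top and R, S below."""
    return [P[0] + Q[0], P[1] + Q[1], R[0] + S[0], R[1] + S[1]]


I2 = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
O2 = [[Fr(0), Fr(0)], [Fr(0), Fr(0)]]
A11 = [[Fr(2), Fr(1)], [Fr(1), Fr(1)]]
A12 = [[Fr(1), Fr(0)], [Fr(3), Fr(1)]]
A21 = [[Fr(0), Fr(1)], [Fr(1), Fr(2)]]
A22 = [[Fr(4), Fr(1)], [Fr(2), Fr(5)]]
A = block(A11, A12, A21, A22)

A11_inv = inv2(A11)
C = sub(A22, matmul(matmul(A21, A11_inv), A12))     # Schur complement of A11 in A
C_inv = inv2(C)

left = block(I2, neg(matmul(A11_inv, A12)), O2, I2)
middle = block(A11_inv, O2, O2, C_inv)
right = block(I2, O2, neg(matmul(A21, A11_inv)), I2)
A_inv = matmul(matmul(left, middle), right)
```

Only two $2 \times 2$ inverses are computed here, those of $A_{11}$ and of $C$; the rest is multiplication.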


Issai Schur was a twentieth-century German mathematician who is 
known primarily for his work in group theory. 


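Example The matrix

$$
A = \left[\begin{array}{cc|cc}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\ \hline
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1
\end{array}\right] \in M_{4 \times 4}(\mathbb{Q})
$$

is nonsingular, despite the fact that all of the given $2 \times 2$ blocks are singular.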
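The block formula for $A^{-1}$ in terms of the Schur complement of $A_{11}$ can be tried out numerically. The sketch below assumes a $4 \times 4$ matrix cut into $2 \times 2$ blocks; the function names (`inv2`, `block`, `schur_inverse`) and the sample blocks are made up for illustration.

```python
from fractions import Fraction

def mul(X, Y):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def sub(X, Y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def neg(X):
    return [[-a for a in r] for r in X]

def inv2(X):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = X
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("block is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def block(tl, tr, bl, br):
    """Assemble a 4 x 4 matrix from four 2 x 2 blocks."""
    return [r + s for r, s in zip(tl, tr)] + [r + s for r, s in zip(bl, br)]

def schur_inverse(A11, A12, A21, A22):
    """Invert [[A11, A12], [A21, A22]], assuming A11 and C are nonsingular."""
    I = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    O = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]
    A11_inv = inv2(A11)
    C = sub(A22, mul(mul(A21, A11_inv), A12))  # Schur complement of A11 in A
    left = block(I, neg(mul(A11_inv, A12)), O, I)
    middle = block(A11_inv, O, O, inv2(C))
    right = block(I, O, neg(mul(A21, A11_inv)), I)
    return mul(mul(left, middle), right)

A11, A12 = [[2, 1], [1, 1]], [[1, 0], [0, 1]]
A21, A22 = [[0, 1], [1, 0]], [[3, 1], [1, 2]]
A = block(A11, A12, A21, A22)
A_inv = schur_inverse(A11, A12, A21, A22)
product = mul(A, A_inv)  # should be the 4 x 4 identity matrix
```

When $A_{11}$ is singular, as in the example above, `inv2` raises an error even though $A$ itself may be invertible, which is exactly the sense in which the conditions are sufficient but not necessary.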


It is important to make clear, however, that it is hardly ever necessary, in
applications, to actually compute the inverse of a nonsingular matrix. One is more likely
to have to compute a product of the form $A^{-1}B$, which can usually be done without
explicitly computing $A^{-1}$ first.

Let $F$ be a field and let $k$ and $n$ be positive integers. Two matrices
$B, C \in M_{k \times n}(F)$ are equivalent if and only if there exist nonsingular matrices
$P \in M_{k \times k}(F)$ and $Q \in M_{n \times n}(F)$ such that $PBQ = C$. This is, indeed, an
equivalence relation on $M_{k \times n}(F)$ since:

(1) $IBI = B$ for each such matrix $B$, showing that $B$ is equivalent to itself;

(2) If $PBQ = C$ then $P^{-1}CQ^{-1} = B$;

(3) If $PBQ = C$ and $P'CQ' = D$ then $(P'P)B(QQ') = D$, where we note that
both $P'P$ and $QQ'$ are again nonsingular.

Similarly, we say that $B$ and $C$ are row equivalent if and only if there exists a
nonsingular matrix $P \in M_{k \times k}(F)$ satisfying $PB = C$, and we say that $B$ and $C$ are
column equivalent if and only if there exists a nonsingular matrix $Q \in M_{n \times n}(F)$
satisfying $BQ = C$. Both of these relations are also equivalence relations on $M_{k \times n}(F)$,
and it is clear that if $B$ and $C$ are row equivalent then they are equivalent (take
$Q = I$) and if they are column equivalent then they are equivalent (take $P = I$).

Equivalence of matrices is a very strong concept. Indeed, it is easy to show that
any matrix $B \in M_{k \times n}(F)$ is equivalent to one which is in block form

$$
\begin{bmatrix} I & O \\ O & O \end{bmatrix}.
$$

Therefore, it is more useful to consider row equivalence of matrices as our basic
tool.

Now let $V$ be a vector space of dimension $n$ over a field $F$ and choose bases
$B = \{v_1, \ldots, v_n\}$ and $D = \{w_1, \ldots, w_n\}$ of $V$. For each $1 \le j \le n$ there exist elements
$q_{1j}, \ldots, q_{nj}$ of $F$ satisfying $w_j = \sum_{i=1}^n q_{ij} v_i$. By Proposition 9.3, we know that the
matrix $Q = [q_{ij}]$ is nonsingular. If $v = \sum_{i=1}^n a_i v_i = \sum_{j=1}^n b_j w_j$ is an element of $V$,
then we see that

$$
v = \sum_{j=1}^n b_j w_j = \sum_{j=1}^n b_j \left( \sum_{i=1}^n q_{ij} v_i \right) = \sum_{i=1}^n \left( \sum_{j=1}^n q_{ij} b_j \right) v_i
$$

and so we must have $a_i = \sum_{j=1}^n q_{ij} b_j$ for all $1 \le i \le n$. Thus we see that

$$
\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} = Q \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}.
$$

The matrix $Q$ is called the change-of-basis matrix from $D$ to $B$.


Example Let $F$ be a field, let $n$ be a positive integer, and let $V$ be the subspace of
the vector space $F[X]$ made up of all polynomials of degree at most $n - 1$. Then
$\dim(V) = n$, and it has a canonical basis $B = \{1, X, \ldots, X^{n-1}\}$. Let $c_1, \ldots, c_n$ be
distinct scalars, and for each $1 \le i \le n$, consider the polynomial

$$
p_i(X) = \prod_{j \ne i} \frac{1}{c_i - c_j}(X - c_j) \in V.
$$

This polynomial is called the $i$th Lagrange interpolation polynomial, and we will
return to these polynomials below in another context. It is clear that

$$
p_i(c_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}
$$

Thus, for example, if $n = 4$ and if we choose $c_1 = 1$, $c_2 = 3$, $c_3 = 5$, and $c_4 = 7$, we
obtain

$$
\begin{aligned}
p_1(X) &= -\frac{1}{48}X^3 + \frac{5}{16}X^2 - \frac{71}{48}X + \frac{35}{16}, \\
p_2(X) &= \frac{1}{16}X^3 - \frac{13}{16}X^2 + \frac{47}{16}X - \frac{35}{16}, \\
p_3(X) &= -\frac{1}{16}X^3 + \frac{11}{16}X^2 - \frac{31}{16}X + \frac{21}{16}, \\
p_4(X) &= \frac{1}{48}X^3 - \frac{3}{16}X^2 + \frac{23}{48}X - \frac{5}{16}.
\end{aligned}
$$

Returning to the general case, we see that the set $D = \{p_1(X), \ldots, p_n(X)\}$ of
Lagrange interpolation polynomials is linearly independent since, if we have
$\sum_{i=1}^n a_i p_i(X) = 0$, then for each $1 \le h \le n$ we have $a_h = \sum_{i=1}^n a_i p_i(c_h) = 0$.
Therefore, $D$ is also a basis of $V$. If $q(X)$ is an arbitrary polynomial in $V$ then
there exist scalars $a_1, \ldots, a_n$ satisfying $q(X) = \sum_{i=1}^n a_i p_i(X)$. Again, this implies
that $a_i = q(c_i)$ for all $i$. In particular, if $q(X) = X^k$ we see that $X^k = \sum_{i=1}^n c_i^k p_i(X)$.
Therefore, the change of basis matrix from $D$ to $B$ is

$$
\begin{bmatrix}
1 & c_1 & c_1^2 & \cdots & c_1^{n-1} \\
1 & c_2 & c_2^2 & \cdots & c_2^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & c_n & c_n^2 & \cdots & c_n^{n-1}
\end{bmatrix}.
$$

A matrix of this form is called a Vandermonde matrix, and such matrices are always
nonsingular.
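The Lagrange interpolation polynomials for $c_1 = 1$, $c_2 = 3$, $c_3 = 5$, $c_4 = 7$ can be recomputed mechanically. The sketch below stores a polynomial as a list of coefficients, lowest degree first; this representation and the function names are choices made for the illustration.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def lagrange_basis(points):
    """Return coefficient lists of p_1, ..., p_n for the given distinct scalars."""
    basis = []
    for i, ci in enumerate(points):
        poly = [Fraction(1)]
        for j, cj in enumerate(points):
            if j != i:
                # Multiply by the linear factor (X - c_j) / (c_i - c_j).
                poly = poly_mul(poly, [Fraction(-cj, ci - cj), Fraction(1, ci - cj)])
        basis.append(poly)
    return basis

def evaluate(poly, x):
    """Evaluate a coefficient list at x by Horner's rule."""
    value = Fraction(0)
    for coeff in reversed(poly):
        value = value * x + coeff
    return value

points = [1, 3, 5, 7]
basis = lagrange_basis(points)
# Matrix with rows (1, c_i, c_i^2, c_i^3), as in the Vandermonde form above.
vandermonde = [[Fraction(c) ** k for k in range(len(points))] for c in points]
```

Evaluating `basis[i]` at `points[h]` gives 1 when `i == h` and 0 otherwise. Equivalently, the matrix whose columns hold the coefficients of the $p_i$ is the inverse of `vandermonde`, which is one way to see that this particular Vandermonde matrix is nonsingular.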


With kind permission of ETH-Bibliothek Zurich, Image 
Archive (Lagrange). 

Joseph-Louis Lagrange was one of the applied 
mathematicians who surrounded Napoleon, and 
his book on analytical mechanics is considered 
a mathematical classic. Alexandre-Théophile 
Vandermonde was an eighteenth century French 
chemist and mathematician who studied determi- 
nants of matrices. Vandermonde matrices do not appear in his work, and it is not clear why 
they are named after him. 


Lagrange interpolation allows us to represent a polynomial $p(X)$ of degree less
than $n$ in a computer not by its list of coefficients but rather by a list of its values
$p(a_1), \ldots, p(a_n)$ at $n$ preselected elements of $F$. Such representations can be used
to obtain algorithms for rapid multiplication of polynomials, especially in the case
the field $F$ is finite (having $n$ elements, of course). Indeed, if $p(X)$ and $q(X)$ are
polynomials in $F[X]$ of positive degree satisfying $\deg(p) + \deg(q) = h < n$, then
$p(X)q(X)$ is the unique polynomial $t(X)$ of degree $h$ satisfying $t(a_i) = p(a_i)q(a_i)$
for all $1 \le i \le h + 1$.
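To make the value representation concrete, here is a small sketch that multiplies two polynomials over $\mathbb{Q}$ by evaluating them at $h + 1$ points, multiplying the values, and interpolating. The sample points $0, 1, \ldots, h$ and the function names are assumptions of the example; a serious implementation would choose the points so that evaluation and interpolation are themselves fast.

```python
from fractions import Fraction

def evaluate(poly, x):
    """Evaluate a coefficient list (lowest degree first) at x by Horner's rule."""
    value = Fraction(0)
    for coeff in reversed(poly):
        value = value * x + coeff
    return value

def interpolate(points, values):
    """Coefficients of the polynomial of degree < len(points) taking the given values."""
    n = len(points)
    result = [Fraction(0)] * n
    for i in range(n):
        # Build values[i] * p_i(X) one linear factor at a time.
        term = [Fraction(values[i])]
        for j in range(n):
            if j != i:
                denom = Fraction(points[i] - points[j])
                shifted = [Fraction(0)] + term                       # term * X
                scaled = [c * points[j] for c in term] + [Fraction(0)]  # term * c_j
                term = [(s - t) / denom for s, t in zip(shifted, scaled)]
        result = [r + t for r, t in zip(result, term)]
    return result

def multiply_by_values(p, q):
    """Multiply p and q through their values at h + 1 points, h = deg(p) + deg(q)."""
    h = (len(p) - 1) + (len(q) - 1)
    points = list(range(h + 1))
    values = [evaluate(p, a) * evaluate(q, a) for a in points]
    return interpolate(points, values)

p = [1, 2]       # 1 + 2X
q = [3, 0, 1]    # 3 + X^2
product = multiply_by_values(p, q)
```

The only multiplications of polynomial data happen pointwise on the lists of values, which is the feature that fast multiplication algorithms exploit.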


Let us now return to the matter of change of basis, and now let us assume
that we have a linear transformation $\alpha : V \to Y$, where $V$ is a vector
space of dimension $n$ over a field $F$ and $Y$ is a vector space of dimension
$k$ over $F$. We have bases $B = \{v_1, \ldots, v_n\}$ and $D = \{w_1, \ldots, w_n\}$ of $V$.
Choose a basis $E = \{y_1, \ldots, y_k\}$ of $Y$. Then $\Phi_{BE}(\alpha)$ is a matrix $C = [c_{ij}]$. If
$Q = [q_{ij}]$ is the change of basis matrix from $D$ to $B$ then for each $1 \le j \le n$
we have

$$
\alpha(w_j) = \alpha\left( \sum_{h=1}^n q_{hj} v_h \right) = \sum_{h=1}^n q_{hj}\, \alpha(v_h) = \sum_{h=1}^n q_{hj} \left( \sum_{i=1}^k c_{ih} y_i \right) = \sum_{i=1}^k \left( \sum_{h=1}^n c_{ih} q_{hj} \right) y_i,
$$

and so $\Phi_{DE}(\alpha) = CQ$, showing that $\Phi_{DE}(\alpha)$ and $C$ are
column equivalent. In the same manner, if we have another basis $G = \{z_1, \ldots, z_k\}$ of
$Y$ and if $P = [p_{ij}]$ is the change of basis matrix from $E$ to $G$, then $z_j = \sum_{i=1}^k p_{ij} y_i$
for all $1 \le j \le k$. If $\Phi_{BG}(\alpha)$ is the matrix $C' = [c'_{ij}]$, then for all $1 \le j \le n$ we
have

$$
\alpha(v_j) = \sum_{h=1}^k c'_{hj} z_h = \sum_{h=1}^k c'_{hj} \left( \sum_{i=1}^k p_{ih} y_i \right) = \sum_{i=1}^k \left( \sum_{h=1}^k p_{ih} c'_{hj} \right) y_i
$$

and this equals $\sum_{i=1}^k c_{ij} y_i$, implying $C = PC'$, and so $C' = P^{-1}C$. Thus $\Phi_{BG}(\alpha)$
and $C$ are row equivalent. If we put both of these results together, we see that
$\Phi_{DG}(\alpha) = P^{-1}\Phi_{BE}(\alpha)Q$, and so $\Phi_{DG}(\alpha)$ and $\Phi_{BE}(\alpha)$ are equivalent.


Example Let a : R? —> R? be the linear transformation given by 


a 
flbip ai 
Qa: pase 
1 1 -+_1{1
E 1 | an =4] 


wor (= 8)-o[ 4] (GL) 


Note that $P^{-1}\Phi_{BE}(\alpha)Q = \Phi_{DG}(\alpha)$.


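Example We will now see an application of linear algebra to calculus. Let $V$ be the
vector space over $\mathbb{R}$ consisting of all infinitely-differentiable functions $f \in \mathbb{R}^{\mathbb{R}}$, and
let $\delta \in \mathrm{End}(V)$ be the differentiation endomorphism.

(1) If $a$ and $b$ are given real numbers, not both equal to 0, then the functions
$f_0 : x \mapsto e^{ax}\sin(bx)$ and $f_1 : x \mapsto e^{ax}\cos(bx)$ belong to $V$ and the subspace
$W = \mathbb{R}\{f_0, f_1\}$ of $V$ is invariant under $\delta$. The restriction of $\delta$ to $W$ can be
represented with respect to the basis $\{f_0, f_1\}$ of $W$ by the nonsingular matrix

$$
A = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}.
$$

It is easy to check that

$$
A^{-1} = (a^2 + b^2)^{-1} \begin{bmatrix} a & b \\ -b & a \end{bmatrix}.
$$

Therefore,

$$
\int f_0(t)\,dt = \delta^{-1}(f_0) = \frac{1}{a^2 + b^2}(a f_0 - b f_1) \quad \text{and} \quad
\int f_1(t)\,dt = \delta^{-1}(f_1) = \frac{1}{a^2 + b^2}(b f_0 + a f_1).
$$

(2) The functions $g_0 : x \mapsto x^2 e^x$, $g_1 : x \mapsto x e^x$, and $g_2 : x \mapsto e^x$ all belong to $V$
and the subspace $Y = \mathbb{R}\{g_0, g_1, g_2\}$ of $V$ is invariant under $\delta$. The restriction
of $\delta$ to $Y$ can be represented with respect to the basis $\{g_0, g_1, g_2\}$ of $Y$ by the
nonsingular matrix

$$
B = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}.
\quad \text{Since} \quad
B^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 2 & -1 & 1 \end{bmatrix},
$$

we see that

$$
\begin{aligned}
\int g_0(t)\,dt &= \delta^{-1}(g_0) = g_0 - 2g_1 + 2g_2, \\
\int g_1(t)\,dt &= \delta^{-1}(g_1) = g_1 - g_2, \quad \text{and} \\
\int g_2(t)\,dt &= \delta^{-1}(g_2) = g_2.
\end{aligned}
$$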
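The second computation can be replayed without forming $B^{-1}$ explicitly: since $B$ is lower triangular, each column of $B^{-1}$ is obtained by forward substitution. The sketch below assumes the basis $\{x^2e^x, xe^x, e^x\}$ in that order; the function names are illustrative only.

```python
from fractions import Fraction

# Column j of B holds the coordinates of delta(g_j) with respect to
# the basis {g_0, g_1, g_2} = {x^2 e^x, x e^x, e^x}.
B = [[1, 0, 0],
     [2, 1, 0],
     [0, 1, 1]]

def solve_lower_triangular(L, b):
    """Solve L x = b by forward substitution (L lower triangular, nonzero diagonal)."""
    x = []
    for i, row in enumerate(L):
        partial = sum(row[j] * x[j] for j in range(i))
        x.append((Fraction(b[i]) - partial) / row[i])
    return x

def antiderivative(j):
    """Coordinates of delta^{-1}(g_j), that is, column j of the inverse of B."""
    unit = [int(i == j) for i in range(3)]
    return solve_lower_triangular(B, unit)

coords = [antiderivative(j) for j in range(3)]
```

Each entry of `coords` lists the coefficients of $g_0$, $g_1$, $g_2$ in an antiderivative of the corresponding $g_j$ (with constant of integration 0), so the three lists can be read against the three integrals displayed above.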


Let us turn to problems connected with the implementation of this theory. Let
$F$ be a field and let $n$ be a positive integer. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ belong
to $M_{n \times n}(F)$ and let $C = AB$. In order to calculate each one of the $n^2$ entries
in $C$, we need $n$ multiplications and $n - 1$ additions/subtractions, and so to
calculate $C$ we need $n^3$ multiplications and $n^3 - n^2$ additions/subtractions.
Putting this in another way, the total number of operations needed to calculate
$AB$ from the definition is on the order of $n^c$, where $c = 3$. If $n$ is very large,
this can entail considerable computational overhead and leaves room for the
introduction of significant error due to roundoff and truncation in the course of the
calculation. It is therefore very important to find a more sophisticated method of
matrix multiplication, if possible. One such method is the Strassen-Winograd
algorithm.


With kind permission of Volker Strassen (Strassen); With 
kind permission of the Department of Computer Science, 
City University of Hong Kong (Winograd). 

Variants of this algorithm were discovered by 
the contemporary German mathematician Volker 
Strassen and the contemporary Israeli mathe- 
matician Shmuel Winograd who later served as 
director of mathematical research at IBM. 


To illustrate the Strassen-Winograd algorithm, let us first begin with the special
case $n = 2$. First, calculate

$$
\begin{aligned}
p_0 &= (a_{11} + a_{22})(b_{11} + b_{22}), & p_1 &= (a_{21} + a_{22})b_{11}, & p_2 &= a_{11}(b_{12} - b_{22}), \\
p_3 &= (a_{21} - a_{11})(b_{11} + b_{12}), & p_4 &= (a_{11} + a_{12})b_{22}, & p_5 &= a_{22}(b_{21} - b_{11}), \\
p_6 &= (a_{12} - a_{22})(b_{21} + b_{22}),
\end{aligned}
$$

and then note that

$$
C = \begin{bmatrix} p_0 + p_5 - p_4 + p_6 & p_2 + p_4 \\ p_1 + p_5 & p_0 - p_1 + p_2 + p_3 \end{bmatrix}.
$$

In this calculation, we used 7 multiplications and 18 additions/subtractions (Winograd's variant
of this algorithm uses only 15 additions/subtractions, but these are more interdependent,
and so the algorithm is less amenable to implementation on parallel computers)
instead of 8 multiplications and 4 additions/subtractions. In the early days of
computers, when multiplication was several orders of magnitude slower than addition,
this in itself was a great accomplishment. If $n = 4$, we write our matrices in
block form:

$$
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \quad \text{and} \quad
B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix},
$$

where each block is a $2 \times 2$ matrix. We now calculate $2 \times 2$ matrices $P_0, \ldots, P_6$ and then construct $C = AB$
as above. To do this, we need 49 multiplications and 198 additions/subtractions, as
opposed to 64 multiplications and 48 additions/subtractions if one goes according
to the definition. We continue recursively. If $n = 2^h$, then the number of
multiplications needed is $M(h) = 7^h$ and the number of additions/subtractions needed is
$A(h) = 6(7^h - 4^h)$ and so $M(h) + A(h) < 7^{h+1}$. (If $n$ is not a power of 2, we can
add rows and columns of 0's in order to enlarge it to the desired size.) Thus, we see
that the number of arithmetic operations needed to calculate $AB$ is on the order of
$n^c$, where $c \le \log_2 7 = 2.807\ldots$ and so, for large $n$, we have a definite advantage
over multiplication following from the definition. Using even more sophisticated
techniques, it is possible to reduce the number of arithmetic operations to the order
of $n^c$, where $c < 2.376\ldots$, as was done by Winograd and Coppersmith in 1986.
Recent results by American mathematicians Chris Umans and Henry Cohn, using
sophisticated group-theoretic techniques, suggest that $c$ can be reduced still further,
but their methods are not, as yet, practical except for matrices of immense size.
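The recursion can be sketched in a few lines for matrices whose size is a power of 2. This is an illustration of the seven-product scheme with products named $p_0, \ldots, p_6$ as above, not a tuned implementation: the one-element `counter` list, used only to tally scalar multiplications, is a device of this example.

```python
def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def sub(X, Y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def split(X):
    """Split a matrix of even size into its four equal blocks."""
    m = len(X) // 2
    return ([r[:m] for r in X[:m]], [r[m:] for r in X[:m]],
            [r[:m] for r in X[m:]], [r[m:] for r in X[m:]])

def strassen(A, B, counter):
    """Multiply square matrices of size 2^h; counter[0] counts scalar products."""
    if len(A) == 1:
        counter[0] += 1
        return [[A[0][0] * B[0][0]]]
    a11, a12, a21, a22 = split(A)
    b11, b12, b21, b22 = split(B)
    # The seven block products p_0, ..., p_6.
    p0 = strassen(add(a11, a22), add(b11, b22), counter)
    p1 = strassen(add(a21, a22), b11, counter)
    p2 = strassen(a11, sub(b12, b22), counter)
    p3 = strassen(sub(a21, a11), add(b11, b12), counter)
    p4 = strassen(add(a11, a12), b22, counter)
    p5 = strassen(a22, sub(b21, b11), counter)
    p6 = strassen(sub(a12, a22), add(b21, b22), counter)
    # Reassemble the four blocks of the product.
    c11 = add(sub(add(p0, p5), p4), p6)
    c12 = add(p2, p4)
    c21 = add(p1, p5)
    c22 = add(add(sub(p0, p1), p2), p3)
    return [r + s for r, s in zip(c11, c12)] + [r + s for r, s in zip(c21, c22)]

def naive(A, B):
    """Product computed directly from the definition, for comparison."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[2, 0, 1, 3], [1, 1, 0, 2], [4, 5, 6, 0], [0, 7, 8, 9]]
count = [0]
C = strassen(A, B, count)
```

For these $4 \times 4$ integer matrices the result agrees with `naive(A, B)`, and `count[0]` records $7^2 = 49$ scalar multiplications, as against 64 for the definition.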

For sparse matrices, namely matrices in which a very large majority of the
entries are 0, these algorithms can be combined with other sophisticated techniques
to produce even faster multiplication. If the matrices are in $M_{n \times n}(F)$ but have no
more than $n$ nonzero entries, then one can multiply them in an order of $n^{2 + k(n)}$
operations, where $k(n) \to 0$ as $n \to \infty$.

The size of matrices for which the Strassen-Winograd algorithm is significantly
faster than the regular method depends, of course, on the particular hardware on
which it is being used. The Strassen-Winograd algorithm can also be modified to
handle the multiplication of matrices which are not necessarily square.

Unfortunately, the Strassen-Winograd algorithm is no less susceptible to roundoff
and truncation errors than the regular algorithm. On a computer with seven-digit
accuracy, the product

$$
\begin{bmatrix}
211 & 2 & 3 & 4 \\
1 & 2 & 3 & 4 \\
0.001 & 0.032 & 0.043 & 0.044 \\
311 & 0.0032 & 1233 & 0.0324
\end{bmatrix}
\begin{bmatrix}
50 & 0.32 & 0.0023 & 421 \\
60 & 0.023 & 0.033 & 982 \\
23 & 0.032 & 0.03 & 623 \\
33 & 0.043 & 0.022 & 44
\end{bmatrix}
$$

equals

$$
\begin{bmatrix}
10871 & 67.834 & 0.7293 & 92840 \\
371 & 0.634 & 0.2463 & 4430 \\
4.411 & 0.0043 & 0.0033 & 60.57 \\
43910.3 & 138.977 & 37.7061 & 899094.0
\end{bmatrix}
$$

using the ordinary method of matrix multiplication, whereas, using the Strassen-Winograd algorithm, we obtain

$$
\begin{bmatrix}
10871 & 68.54 & 0.6294 & 92840 \\
370.9 & 1.0 & 0.2463 & 4430.18 \\
4.411 & 0.0043 & 0 & 62.0 \\
43910.3 & 139.047 & 37.7 & 899095.0
\end{bmatrix}.
$$

This problem can be overcome to
some extent by stopping the recursion in the Strassen-Winograd algorithm early,
and doing the bottom-level matrix multiplication using the ordinary method.
Another disadvantage of this algorithm is that it requires a much larger amount of
scratch memory space to perform its calculations.

There are other tricks that can be used to reduce the computations necessary
in matrix multiplication. For example, if $n$ is a positive integer and if $A, B, C, D \in
M_{n \times n}(\mathbb{R})$, then the matrix product $(A + iB)(C + iD)$ in $M_{n \times n}(\mathbb{C})$ can be
calculated using only three matrix multiplications in $M_{n \times n}(\mathbb{R})$, rather than the expected
four, by noting that

$$
(A + iB)(C + iD) = AC - BD + i[(A + B)(C + D) - AC - BD].
$$
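The three-multiplication identity is easy to check on a small case. In the sketch below the function name `complex_product` and the sample matrices are invented for the illustration; the real and imaginary parts are returned as two separate real matrices.

```python
def mul(X, Y):
    """Ordinary matrix product of two real matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def sub(X, Y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def complex_product(A, B, C, D):
    """Return (real part, imaginary part) of (A + iB)(C + iD) using three real products."""
    AC = mul(A, C)
    BD = mul(B, D)
    mixed = mul(add(A, B), add(C, D))   # equals AC + AD + BC + BD
    return sub(AC, BD), sub(sub(mixed, AC), BD)

A, B = [[1, 2], [3, 4]], [[0, 1], [1, 0]]
C, D = [[2, 0], [1, 1]], [[1, 1], [0, 2]]
real, imag = complex_product(A, B, C, D)
```

The saving is one real matrix multiplication at the price of a few extra matrix additions, which is worthwhile whenever multiplication dominates the cost.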


If we have a parallel-processing computational system at our disposal, matrix
multiplication can be done much more rapidly. There exist parallel algorithms
to multiply two $n \times n$ matrices in an order of $\log(n)$ time, on the condition
that we have $n^3$ processors working in parallel. Given the availability of
such parallel computational power, one can also invert a nonsingular $n \times n$ matrix
in an order of $\log^2(n)$ time. The first such algorithm was developed by Laszlo
Csanky in 1977, though this algorithm has the disadvantage of being wildly
unstable.

Again, we keep in mind that real and complex numbers are represented in a
computer by approximations having a limited degree of accuracy. The longer
calculations become, the more the error due to roundoff and truncation increases and limits the
correctness of the calculations. It is therefore important to reduce the effect of roundoff and
truncation errors as much as possible. Let us recall how our algorithm for inverting
a matrix $A$ worked:

(1) We formed the matrix $[I \;\; A] = [b_{ij}]$;

(2) We interchanged the first row with one of the rows below it, if necessary, such
that $b_{1,n+1} \ne 0$; we then multiplied this row by $b_{1,n+1}^{-1}$ so that this element is
now equal to 1, and we subtracted multiples of this row from the rows below it,
in order to make $b_{i,n+1}$ equal to 0 for all $1 < i \le n$.

(3) We now iterate this process for the elements $b_{h,n+h}$, where $h = 2, 3, \ldots$ and
so forth. If we cannot do it, i.e., if there exists an $h$ such that $b_{i,n+h} = 0$ for all
$h \le i \le n$, the matrix $A$ is singular. Otherwise, at the end of the process, we
have brought the matrix to the form $[A^{-1} \;\; I]$.

The elements $b_{h,n+h}$ are called pivots of the algorithm. If we are working over
$\mathbb{R}$ or $\mathbb{C}$, we can minimize roundoff and truncation errors, to some extent, by making
sure that each time we interchange rows we choose to bring into the pivot position
a nonzero number having maximal absolute value. This strategy is known
as partial pivoting. We could do better by also interchanging columns in order to
bring into the pivot position $b_{h,n+h}$ the element $b_{i,n+j}$ ($h \le i, j \le n$) having maximal
absolute value. This strategy is known as full pivoting; it requires a certain
amount of computational overhead on the side so that the columns can be returned
to their proper positions at the end of the algorithm. Although there are matrices
so pathological that full pivoting rather than partial pivoting is needed in order to
invert them, most experts believe that it is not worth the effort and the computational
overhead and that for such matrices one should use other methods altogether.
Partial pivoting also does not work well on parallel or systolic-array computers, since
it requires many nonlocal data movements. Several variants of pivoting strategies
for matrices having specific structures have, however, been developed and are in
wide use.
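Partial pivoting can be sketched as follows in floating-point arithmetic. The code uses the more common layout $[A \mid I]$ rather than $[I \;\; A]$, clears each pivot column both above and below the pivot, and tests for an exactly zero pivot; all three are simplifying assumptions of this example, and a production routine would compare against a tolerance instead.

```python
def invert_partial_pivoting(A):
    """Invert A by Gauss-Jordan elimination on [A | I] with partial pivoting."""
    n = len(A)
    rows = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
            for i, row in enumerate(A)]
    for h in range(n):
        # Partial pivoting: among rows h..n-1, take the entry of column h
        # having the largest absolute value.
        pivot_row = max(range(h, n), key=lambda r: abs(rows[r][h]))
        if rows[pivot_row][h] == 0.0:
            raise ValueError("matrix is singular")
        rows[h], rows[pivot_row] = rows[pivot_row], rows[h]
        pivot = rows[h][h]
        rows[h] = [x / pivot for x in rows[h]]
        # Subtract multiples of the pivot row from every other row.
        for r in range(n):
            if r != h:
                factor = rows[r][h]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[h])]
    return [row[n:] for row in rows]

A = [[0.0, 2.0, 1.0], [1.0, 1.0, 0.0], [4.0, 0.0, 2.0]]
A_inv = invert_partial_pivoting(A)
```

Note that the first pivot chosen here is the entry 4 in the third row, not the first nonzero entry in the column; that choice is the whole content of the strategy.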

Indeed, let us now consider another method. It is clearly easier to invert a
nonsingular upper-triangular or lower-triangular matrix, namely a matrix in one of these
forms all of the diagonal elements of which are nonzero. Therefore, our job would
be much easier if we could write $A$ in the form $LU$, where $L$ is lower triangular
nonsingular and $U$ is upper triangular nonsingular, for then $A^{-1} = U^{-1}L^{-1}$. This
is not always possible. For example, one can see that there is no way of writing the
matrix $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \in M_{2 \times 2}(\mathbb{R})$ in this form. However, it is always possible to write $A$
in the form $LU$ when $A$ equals a product of elementary matrices of the form $E_{i;c}$
and $E_{ij;c}$ only.
How can this be done? Assume that $A = [a_{ij}]$, $U = [u_{ij}]$, and $L = [v_{ij}]$ and
that $A = LU$, where $U$ is upper-triangular nonsingular and $L$ is lower-triangular
nonsingular. Then for each $1 \le i, j \le n$ we have $a_{ij} = \sum_{h=1}^n v_{ih} u_{hj}$. In each of $L$
and $U$ there are only $\frac{1}{2}(n^2 + n)$ entries which can be nonzero and so our problem
is one of solving $n^2$ nonlinear equations in $n^2 + n$ unknowns. This means that we
can allow ourselves to choose the value of $n$ of these variables arbitrarily, and we will
do so by insisting that $v_{ii} = 1$ for all $1 \le i \le n$. Now we have a system of $n^2 + n$
nonlinear equations in $n^2 + n$ unknowns, which can be solved by a method known
as Crout's algorithm:

(1) First set $v_{ii} = 1$ for all $1 \le i \le n$;
(2) For all $1 \le j \le n$ and all $1 \le i \le j$, first calculate $u_{ij} = a_{ij} - \sum_{h=1}^{i-1} v_{ih} u_{hj}$ and
then $v_{ij} = \frac{1}{u_{jj}} \left( a_{ij} - \sum_{h=1}^{j-1} v_{ih} u_{hj} \right)$ for all $j < i \le n$.
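The two formulas in step (2) translate almost line by line into code. The sketch below works column by column with exact fractions, and it simply gives up when a zero appears on the diagonal of $U$, since in that case the formulas above cannot be applied; the function name and the sample matrix are assumptions of the example.

```python
from fractions import Fraction

def lu_unit_lower(A):
    """Factor A = L U with ones on the diagonal of L; fails on a zero pivot."""
    n = len(A)
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        # Entries u_ij of column j on and above the diagonal.
        for i in range(j + 1):
            U[i][j] = Fraction(A[i][j]) - sum(L[i][h] * U[h][j] for h in range(i))
        if U[j][j] == 0:
            raise ZeroDivisionError("zero pivot: no factorization of this form was found")
        # Entries v_ij of column j below the diagonal.
        for i in range(j + 1, n):
            L[i][j] = (Fraction(A[i][j]) - sum(L[i][h] * U[h][j] for h in range(j))) / U[j][j]
    return L, U

A = [[2, 1, 1],
     [4, 3, 3],
     [8, 7, 9]]
L, U = lu_unit_lower(A)
```

Applied to $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, the routine stops at the very first pivot, in line with the remark that this matrix has no factorization of the form $LU$.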


With kind permission of the National Portrait Gallery (Tur- 
ing); With kind permission of Sir Peter Swinnerton-Dyer 
(Swinnerton-Dyer). 
The LU method was devised by the British mathe- 
matician Alan Turing who is better known as the 
founder of automata theory and one of the fathers 
of the electronic computer. It appears implicitly 
in the work of Jacobi on bilinear forms. The first
computer algorithm to compute LU factorizations using partial pivoting was described by 
the contemporary British mathematicians D.W. Barron and Sir Peter Swinnerton-Dyer. 
Prescott Crout was a twentieth-century American mathematician. 


We note that if $A$ is a nonsingular matrix which can be written in the form $LU$,
where $L = [v_{ij}]$ is a lower-triangular nonsingular matrix satisfying $v_{ii} = 1$ for all
$1 \le i \le n$ and $U = [u_{ij}]$ is upper-triangular and nonsingular, then this factorization
must be unique. Indeed, assume that $L_1 U_1 = L_2 U_2$ where the $L_h$ are lower-triangular
matrices with 1's on the diagonal, and the $U_h$ are nonsingular upper-triangular
matrices. Then $L_2^{-1} L_1 = U_2 U_1^{-1}$. Since the inverse of a nonsingular triangular matrix
is triangular of the same kind, and since the product of lower-triangular matrices is
lower triangular and the product of upper-triangular matrices is upper triangular,
this matrix must be a diagonal matrix. Its diagonal entries are those of $L_2^{-1} L_1$, all of
which equal 1. But then $L_2^{-1} L_1 = I$ and so $L_1 = L_2$ and that
implies that $U_1 = U_2$, proving uniqueness.


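Example Some singular matrices may also be written in the form $LU$, but for them
the above uniqueness result is no longer necessarily true. For example,

$$
\begin{bmatrix} 1 & -1 & 2 \\ -1 & 1 & -1 \\ 2 & -2 & 4 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & b & 1 \end{bmatrix}
\begin{bmatrix} 1 & -1 & 2 \\ 0 & 0 & 1 \\ 0 & 0 & -b \end{bmatrix}
$$

for any scalar $b \in \mathbb{R}$.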


As was previously remarked, not all nonsingular matrices can be written in the
form $LU$. However, we have already noted that any nonsingular matrix $A$ can be
written in the form $PC$, where $P$ is a permutation matrix and $C$ is a product of
elementary matrices of the form $E_{i;c}$ and $E_{ij;c}$, and $C$ can be written in the desired
$LU$ form.


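Example It is easy to verify that

$$
\begin{bmatrix} 0 & 1 & 1 & -3 \\ -2 & 4 & 1 & 4 \\ 0 & 0 & 0 & 1 \\ 3 & 1 & 1 & 0 \end{bmatrix} = PLU,
$$

where

$$
P = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}
$$

is a permutation matrix,

$$
L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ -\frac{3}{2} & 7 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
$$

is lower triangular, and

$$
U = \begin{bmatrix} -2 & 4 & 1 & 4 \\ 0 & 1 & 1 & -3 \\ 0 & 0 & -\frac{9}{2} & 27 \\ 0 & 0 & 0 & 1 \end{bmatrix}
$$

is upper triangular.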
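A factorization of the form $PLU$ can be produced by ordinary elimination if row interchanges are recorded along the way. The following sketch swaps in the first row with a nonzero entry whenever a zero pivot is met; that pivot rule, the list `perm` used for bookkeeping, and the function name are choices made for this illustration and are applied here to the $4 \times 4$ matrix of the example.

```python
from fractions import Fraction

def plu(A):
    """Return (perm, L, U) such that row i of L U equals row perm[i] of A."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(0)] * n for _ in range(n)]
    perm = list(range(n))
    for col in range(n):
        # Look for the first usable pivot in this column.
        pivot = next((r for r in range(col, n) if U[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        # Swap the rows of U, the multipliers already stored in L, and the record.
        for M in (U, L, perm):
            M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            L[r][col] = U[r][col] / U[col][col]
            U[r] = [x - L[r][col] * y for x, y in zip(U[r], U[col])]
    for i in range(n):
        L[i][i] = Fraction(1)
    return perm, L, U

A = [[0, 1, 1, -3],
     [-2, 4, 1, 4],
     [0, 0, 0, 1],
     [3, 1, 1, 0]]
perm, L, U = plu(A)
```

Here `perm` lists, for each row of $LU$, the row of $A$ it reproduces; the permutation matrix $P$ is the one that undoes this rearrangement.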


In general, the problem of factorization of a square matrix into a product of
matrices of a more desirable form is one which arises often in computational matrix
theory, and many techniques have been developed to facilitate such computations.
One method, for example, is to associate with any matrix $A = [a_{ij}] \in M_{n \times n}(F)$ an
undirected graph $\Gamma_A$ the vertices of which are $\{1, \ldots, n\}$ and in which there exists
an edge connecting $i$ and $j$ if and only if $a_{ij} \ne 0$ or $a_{ji} \ne 0$. If this graph has nice
structure (if it is a tree, for example) then this structure can be exploited to
produce efficient factorization algorithms for $A$, as has recently been shown by Israeli
computer scientist Sivan Toledo.


Exercises 


Exercise 441 


1 3 
Let F = GF(5). Calculate | 2 1 
1 2 


WS 


12 2 
4 3 2] inM3,3(F). 
14 2 


Exercise 442 
Does there exist a real number b such that the matrices 


10 -1 0 b -1 -1 0 
01 0 -1 St. be O44 
eS oO at pp) OP a oe et 
01 0 -1 0 1-1 §$ 


are a commuting pair in Ma,.4(R)? 
Exercise 443


Let F = GF(7) and let K be the subalgebra of M2,.2(F) consisting of all ma- 
trices of the form E ;| , for a,b € F. Show that K is a field. Is it a field if 


b 
F =GF(5)? 
Exercise 444 


Let A= : | € M2,.2(R). Find the set of all matrices B € M,,.,(R) satis- 


fying BA= AB. 


Exercise 445 


1 i 


Let A= Ee | € M>,.2(C). Find a complex number c satisfying (cA)* = A. 


Exercise 446 


0 1 
LettA=|]0 0 € M3x3(R). Find a positive integer k satisfying A‘ = A7!. 
1 0 


oro 


Exercise 447 


0 0 1 
Let F be a field. Find all matrices A € M3,.3(F) satisfying A7=|0 0 0 
0 0 0 


Exercise 448 

Let $n$ be a positive integer and let $F$ be a field of characteristic 0. Show that
$AB - BA \ne I$ for all $A, B \in M_{n \times n}(F)$ (in other words, that $I$ is not the product
of any two elements of the Lie algebra $M_{n \times n}(F)^{-}$).


Exercise 449 
Show that there are infinitely-many pairs (a,b) of real numbers satisfying the 


a 0 0 1 0 1 1 0 1 a 0 0 
condition} 0 1 0O 0 1 OyJ=];0 1 O 0 1 0 
0 0 b 1 0 1 1 0 1 0 0 b 
Exercise 450 
Does there exist a positive integer k satisfying 
0 1 0]fo 0 1}* fo 01 
0 0 1 0 1 0) =)}1 0 Of? 
1 0 0 1 0 0 0 1 0 
Exercise 451
Let $F = GF(3)$. Show that there exist at least 27 distinct matrices $A$ in $M_{3 \times 3}(F)$


satisfying A? = /. 


Exercise 452 
If $F = GF(2)$, find the set of all pairs $(A, B)$ of matrices in $M_{2 \times 2}(F)$ satisfying
$AB - BA = I$.


Exercise 453 


For a field F, find {A € M2.(F) | A? = O}. 


Exercise 454 


1 1 -1 1 -1 3 
Find a matrix A € M3,3(R) satisfying A | 2 1 O;=] 4 3 2 
1 -l 1 1 -—2 5 


Exercise 455 


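Show that if $A = \begin{bmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$ then for each $n > 1$ we have

$$
A^n = \begin{bmatrix} a^n & na^{n-1} & \frac{1}{2}n(n-1)a^{n-2} \\ 0 & a^n & na^{n-1} \\ 0 & 0 & a^n \end{bmatrix}.
$$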


Exercise 456 


Let (K, e) be an associative unital algebra over a field F and let S be the subset 
vi OK 3 
of M3,.3(K) consisting of all matrices of the form | Ox v22 Ox |. Is S an 


v31 OK 33 
F-subalgebra of M3,.3(K)? 


Exercise 457 


Let n be a positive integer and let F be a field. A matrix in M,,.,(F) of the form 


a; a2... an 
aA, a, oe) GAn-] ; : ; . . 

is called a circulant matrix. Determine if the set of all 
ag a3.iC««s«.. a 


circulant matrices in My x»(F) is an F-subalgebra of Mixn(F). 
© Collections artistiques de l'Université de Liège. Reprinted with kind permission.
Circulant matrices, which have many important applications, were 


first studied by the nineteenth-century French mathematician Eugéne 
Catalan. 


Exercise 458 


Let n be a positive integer and let F be a field. If A € M,y,(F) is a nonsingular 
circulant matrix, is A~! necessarily a circulant matrix? 


Exercise 459 
b 


a 
2b al’ 
where a, b € Q. Show that K is a Q-subalgebra of M2 x2(Q) which is, in fact, a 
field. 


Let K be the subset of M2.2(Q) consisting of all matrices of the form 


Exercise 460 
Find a matrix A € M2, (IR) satisfying A* = E iI. 


Exercise 461 


1 1 1 
Find all matrices A € M3,.3(R) satisfying A] 2 2 2]=0O. 
0 1 1 


Exercise 462 


Let A= ° 4 € M2x2(R) be an idempotent matrix. Show that a+deé 


d 
io: 1221, 


Exercise 463 


mt n-1 n-1 
Show that k 4 = Ee ae for alln > 1. 


Exercise 464 


Let F be a field and let A = k z 


| € Mox2(F). Find A” for alln > 1. 


Exercise 465 
Find matrices A, B € M2,.2(Q) for which 


$(A - B)(A + B) \ne A^2 - B^2$.
Exercise 466


Let F be a field and let A = € M3,3(F). Show that Ak+? = A‘ + 


~ Orr 
— OO 
oro 


A? — I for all positive integers 


Exercise 467 
Let $n$ be a positive integer and let $F$ be a field. Let $A, B \in M_{n \times n}(F)$ satisfy
$A + B = I$. Show that $AB = O$ if and only if $A$ and $B$ are idempotent.


Exercise 468 

Let n be a positive integer and let (K, e) be an associative unital algebra over a 
field F'. Define a new operation LE] on Myx»(K), called the Schur product (some- 
times also called the Hadamard product, especially in the context of statistics), 
by setting [v;;] LI) [wij] = [vij e wij], for all 1 <i, j <n. Is Maxn(K), +, 4) 
an F’-algebra? Is it associative? Is it unital? When is it commutative? 


Exercise 469 
Let n be a positive integer and for each A = [ajj] € Mnxn(R), let w(A) 
max, <j, j<n |aij|. Show that 4(A”) < nA)? for all A € Mnxn(R). 


Exercise 470 
0 1 0 
Let F be a field. Find a matrix A € M3,3(F) satisfying A7=|0 0 0 | or 
0 0 0 
show that no such matrix exists. 


Exercise 471 
Find a matrix A € M2 x2(Q) satisfying A b | Al= ls e for all 


c O 
cEQ. 
Exercise 472 


Let F be a field and let n be a positive integer. Show that H};AM\;BA\; = 
Ay, BHAA, for all A, Be Masxn(F). 


Exercise 473 
Let F be a field and let n be a positive integer. Show that (77) 0; Wij AHji)B 
= BOY i ee Hy; AH;jj) for all A, BE Maryn (PF). 


Exercise 474 
Exercise 475
Find infinitely-many triples (A, B, C) of nonzero matrices in M3,.3(Q), the en- 
tries of which are nonnegative integers, satisfying the condition A* + B? = C?. 


Exercise 476 
Let F be a field. Find a matrix A € Mq,.4(F) satisfying A+ = 1 4 A?. 


Exercise 477 

Let $n$ be a positive integer and let $F = GF(p)$ for some prime integer $p$.
Show that for any $A \in M_{n \times n}(F)$ there exist positive integers $k > h$ satisfying
$A^k = A^h$. Would this also be true if we chose $F = \mathbb{Q}$?


Exercise 478 
Let A = [ajj] € M2x2(C) be a matrix satisfying the condition that sla wt 


a22| A /a\1a22 — a\2a21. Show that there exist four distinct matrices B € 
Mox2(C) satisfying B? = A. 


Exercise 479 
Let c be a given complex number. Find the set of all matrices A € M2 x2(C) 
satisfying (A — c1)* = O. 


Exercise 480 
3 — 4c 2—4c 2—4c 
Show that | —1-+ 2c 2c —1+2c | is involutory for all complex num- 
—3+2c -—3+4+2c -—2+42c 
bers c. 


Exercise 481 

Let $n$ be a positive integer and let $F$ be a field. How many matrices $A = [a_{ij}] \in
M_{n \times n}(F)$ having entries in $\{0, 1\}$ satisfy the condition that each row and each
column contain exactly one 1?


Exercise 482 
Show that for an integer n > 4 and for a field F there exist matrices A and B in 
Mnxn(F) satisfying A* = B? = O but AB= BAZ O. 


Exercise 483 

Let F = GF(2) and let F’ be a field of characteristic other than 2. Define a 
function g : M2 2(F") > M2 2(F) as follows: If A = [ajj] € M2x2(F’) then 
set (A) = [b;;], where 


—s 1 if a;; 4 0, 
"10 otherwise. 


Is g(A + A’) = @(A) + @(A’) for all A, A’ € M2y2(F’)? Is (AA) = 
y(A)p(A’) for all A, A’ € M2y.2(F')? 
Exercise 484
Find a matrix J 4 A € M3,.3(Q) satisfying 


1 0 0 1 0 0 1 0 0 1 0 0 
Aj}1l 1 OJ=]1 1 OJA and A}O 1 O}J=]0 1 OJA 
0 0 1 0 0 1 01 1 01 1 


Exercise 485 
For each real number a, find a matrix B(a) € M2x2(R) satisfying 


ao cane) |= 8 1 | 
i 1 


sin(a) __cos(a) sin(a) 
or show that such matrices need not exist. 


Exercise 486 


5 


Let A= E 


” € Mox2(IR). What is A!94? 


Exercise 487 


Find all pairs (a, b) of rational numbers such that the matrix A = EE 4 E 


M2 x2(Q) is idempotent. 


Exercise 488 

Let $F$ be a field and let $n$ be a positive integer. Show that there do not
exist nonsingular matrices $P, Q \in M_{n \times n}(F)$ satisfying $PAQ = A^T$ for all
$A \in M_{n \times n}(F)$.


Exercise 489 
Let $F$ be a field and let $A, B \in M_{n \times n}(F)$ be a commuting pair of matrices,
where $B$ is nonsingular. Is $(A, B^{-1})$ necessarily a commuting pair?


Exercise 490 
Let F be a field. Is S = {|< | 
c ad 


M2 x2(F)? 


a+tc=b+ a| an F-subalgebra of 


Exercise 491 

Let $F$ be a field of characteristic other than 2, let $n$ be a positive integer, and let
$A \in M_{n \times n}(F)$ be an involutory matrix. For each $c \in F$, let $B_c = c(A + I)$. For
which values of $c$ do we have $B_c^2 = B_c$?


Exercise 492 


Find all rational numbers a, b, and d satisfying the condition that E | € 


Mo x2(Q) is involutory. 
Exercise 493
Let F = GF(3) and let A = : | € M3x3(F). Show that the subset 


{O,1,21,A,1+A,21+A,2A,1+2A,21+2A} of M3 3(F) isa field under 
addition and multiplication of matrices. 


Exercise 494 
Let F = GF(p), where p is a prime integer, and let K be the subset of M2,.2(F) 


b ’ , where a,b € F. Show that K, 


together with the operations of matrix addition and multiplication, is a field when 
p =3 and is not a field when p = 5. What happens when p = 7? 


oe ; a 
consisting of all matrices of the form 


Exercise 495 
Let n be a positive integer, let F be a field, and let O 4 A, BE Myxn(F). Show 
that there exists a matrix C € Myxn(F) satisfying ACB 4 O. 


Exercise 496 
Find all matrices A, B € M2,2(R), the entries of which are nonnegative inte- 
gers, which satisfy AB = ‘ ‘ 
Exercise 497 
Let V = M3,.3(Q). For each rational number f, let a, : V > V be the linear 


0 1 3 
transformation At A| ¢ 0 0 |. Is the function t +> a; a linear transfor- 
0 -l1 4 


mation from Q to End(V), both considered as vector spaces over Q? 
Exercise 498 


Let n be a positive integer, let F be a field, and for some fixed c € F, let A = [a;;] 
be the matrix in My x»(F) defined by 


c wheni + j is even, 
aij = . 
Wy 0 otherwise. 


Show that the subset {A, A2, A?} of Maxn(F) is linearly dependent. 


Exercise 499 


0001 
1001 

Let F = GF(2) and let A=] 1 9 go | © Maxa(F). Let L = (O}U 
0010 


{Ai |i >0} C Ma4y4(F). Show that L is closed under addition. Is L, under the 
usual definitions of addition and multiplication of matrices, a field? 
Exercise 500
Let K be the set of all matrices in. M2,.9(Q) of the form ; Be Show that 


K is a subalgebra of M22(Q) which is in fact a field. 


Exercise 501 
Find the set of all matrices A € M2,.2(Q) which satisfy A* + A = il 


Exercise 502 
Let A= 7 "| € M2 x2(Q) and let B and C be matrices in M2,.2(Q) satis- 
fying AB = BA and AC = CA. Show that BC=CB. 


Exercise 503 
Find infinitely-many matrices A € M3,.3(Q) satisfying 


1 -1 2 1 2 0 1 
A|2 O 1ly;==);0 2 -3 
2 <1 3) 210° 0 0 
Exercise 504 
1 -1 -!l 
LetA=|]| -l 1 —1 | € M3,3(Q). Find functions f and g from the set of 
-1 -l 1 


f(r) gin) gin) 
all positive integers to Q satisfying the condition that A” =} g(n) f(n) g(n) 
gin) gin) f(n) 


for alln > 1. 


Exercise 505 
Let $F = GF(2)$. Do there exist matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ in $M_{2 \times 2}(F)$
satisfying $a_{11} + a_{22} = 1$, $b_{11} + b_{22} = 0$, and $AB = I$?


Exercise 506 

Let F be a field and let G be the set of all matrices in M3 3(F) of the form 
1 0 0 
a 0 OO], where a,b € F. Is G closed under matrix multiplication? Does 
0 0 b 

there exist a matrix J in G satisfying the condition that AJ = A for all A € G? 

If such a matrix J exists, is it necessarily true that JA = A for all A € G? 


Exercise 507 
Let $n$ be a positive integer and let $F$ be a field. Let $A$ and $B$ be matrices in
$M_{n \times n}(F)$ of the form $\begin{bmatrix} I & A' \\ O & I \end{bmatrix}$ and $\begin{bmatrix} I & B' \\ O & I \end{bmatrix}$ respectively, where $A'$ and $B'$
are (not-necessarily square) matrices of the same size. Find necessary conditions
for $A$ and $B$ to satisfy $AB = BA$.


Exercise 508 
Let $F$ be a field and let $A, B \in M_{2 \times 2}(F)$. Show that $(AB - BA)^2$ is a diagonal
matrix.


Exercise 509 

Let $n$ be a positive integer and let $F$ be a field. Let $A \in \mathcal{M}_{n\times n}(F)$ be a diagonal
matrix having distinct entries on the diagonal. Let $B \in \mathcal{M}_{n\times n}(F)$ be a matrix
satisfying $AB = BA$. Show that $B$ is also a diagonal matrix.


Exercise 510 

Let $n$ be a positive integer and let $F$ be a field. For each integer $-n < t < n$, let
$D_t(F)$ be the set of all matrices $A = [a_{ij}] \in \mathcal{M}_{n\times n}(F)$ satisfying the condition
that $a_{ij} = 0$ when $j \neq i + t$. Thus, for example, $D_0(F)$ is the set of all diagonal
matrices in My x,(F). If A € D;(F) and B € D;(F), does there necessarily exist 
an integer —n <u <t such that AB € D,(F)? 


Exercise 511 


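Let $A = \begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 2 & 4 & 6 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ be matrices in $\mathcal{M}_{3\times 3}(\mathbb{R})$.
Find infinitely-many lower-triangular matrices $C$ satisfying $A = CB$.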


Exercise 512 

Let $n$ be a positive integer and let $F$ be a field. Let $A_1, \ldots, A_n$ be upper-triangular
matrices in $\mathcal{M}_{n\times n}(F)$ satisfying the condition that the $(i,i)$-entry in $A_i$ is equal
to $0$ for $1 \leq i \leq n$. Show that $A_1 \cdots A_n = O$.


Exercise 513 

Let F be a field in which we have elements a 4 0 and b. Show that there exists 
an upper-triangular matrix C € M2x2(F) satisfying E | C= E a IsC 
necessarily unique? 


Exercise 514 
Let $F$ be a field. Find an element $A$ of $\mathcal{M}_{2\times 2}(F)$ satisfying $AA^T \neq A^T A$.


Exercise 515 
Let $F$ be a field and let $n > 1$. If a matrix $A \in \mathcal{M}_{n\times n}(F)$ satisfies $AA^T = O$,
does it necessarily follow that $A^T A = O$?


Exercise 516 
Let $n$ be a positive integer, let $F$ be a field, and let $A \in \mathcal{M}_{n\times n}(F)$ satisfy the
condition $A = AA^T$. Show that $A^2 = A$.


Exercise 517
Let $n$ be a positive integer, let $F$ be a field, and let $A, B \in \mathcal{M}_{n\times n}(F)$ be symmetric
matrices. Is $ABA$ necessarily symmetric?


Exercise 518 
Let $n$ be a positive integer and let $F$ be a field. If $A \in \mathcal{M}_{n\times n}(F)$ is symmetric, is
$A^h$ symmetric for all $h > 1$?


Exercise 519 


1 -—2 1 3 -l1 1 : 
Show that {| 2 | ; E ‘| ; 1 El forms a basis for the subspace 


of M2 x2(Q) consisting of all symmetric matrices. 


Exercise 520 


Does there exist a matrix A € M),.2(R) satisfying AA! = E | ? 


Exercise 521 
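Given real numbers $a$, $b$, and $c$, find all real numbers $d$ such that

$$
\begin{bmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & a \\ 0 & -1 & a & b \\ -1 & a & b & c \end{bmatrix}
\begin{bmatrix} a & b & c & 1 \\ 1 & 0 & 0 & 0 \\ 0 & d & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
$$

is symmetric.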


Exercise 522 
Find a matrix B € M2 .2(Q) such that the Nievergelt’s matrix equals B'B. 


Exercise 523 


1 2 -3 
Calculate | 0 1 2 in. M3 3(R). 
0 0 1 
Exercise 524 
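Let $a \in \mathbb{R} \setminus \{1, -2\}$. Calculate $\begin{bmatrix} a & 1 & 1 \\ 1 & a & 1 \\ 1 & 1 & a \end{bmatrix}^{-1} \in \mathcal{M}_{3\times 3}(\mathbb{R})$.


Exercise 525
Does there exist an $a \in \mathbb{R}$ such that $\begin{bmatrix} 3 & 4 & 0 \\ 8 & 5 & -2 \\ a & -7 & 6 \end{bmatrix} \in \mathcal{M}_{3\times 3}(\mathbb{R})$ is singular?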
Exercise 526

Let $n$ be a positive integer. Each complex number $c$ defines a matrix $A(c) = [a_{ij}] \in \mathcal{M}_{n\times n}(\mathbb{C})$
given by $a_{ij} = c^{(i-1)(j-1)}$ for all $1 \leq i, j \leq n$. If $\omega = e^{2\pi i/n} \in \mathbb{C}$,
show that $A(\omega)$ is nonsingular and satisfies $A(\omega)^{-1} = \frac{1}{n} A(\omega^{-1})$.


Exercise 527 
Let n be a positive integer and let F be a field. Given a matrix B € Mpxn(F), 


: ; —Bv |. 
do there exist vectors u, v € F” such that the matrix T T is non- 
—u° Bu’ Bu 
singular? 
Exercise 528 


Is the matrix $\begin{bmatrix} 1+X & -X \\ X & 1-X \end{bmatrix} \in \mathcal{M}_{2\times 2}(\mathbb{Q}[X])$ nonsingular?


Exercise 529 

2 1-a 0 

Is the matrix 0 l—a* 1—a | €M3,3(C) nonsingular, where a = 
l-a 0 L=@ 

—5+4/-3€C. 


l-a 


Exercise 530 
Let $n$ be a positive integer and let $F$ be a field. If $A \in \mathcal{M}_{n\times n}(F)$ is nonsingular, is
the same necessarily true for $A + A^T$?


Exercise 531 

Let $n$ be a positive integer and let $F$ be a field. Let $A = [a_{ij}] \in \mathcal{M}_{n\times n}(F)$ satisfy
the condition that $\sum_{i=1}^{n} a_{ij} = 1$ for all $1 \leq j \leq n$. Show that the matrix $I - A$ is
singular.


Exercise 532 
Let $n$ be a positive integer and let $F$ be a field. If $A \in \mathcal{M}_{n\times n}(F)$ is a Markov
matrix, is $A^{-1}$ necessarily a Markov matrix?


Exercise 533 
Let n be a positive integer and let F be a field. For A € Myxn(F), show that AZ 
is nonsingular if and only if A? is nonsingular. 


Exercise 534 
Let F = GF(p), where p is a prime integer, and let n be a positive integer. What 
is the probability that a matrix in. M,,(F), chosen at random, is nonsingular? 
Exercise 535

Let $P = \begin{bmatrix} 0 & -1 \\ 1 & -1 \end{bmatrix} \in \mathcal{M}_{2\times 2}(\mathbb{R})$ and let $A$ and $Q$ be nonsingular matrices in
$\mathcal{M}_{2\times 2}(\mathbb{R})$. Set $B = AQ^{-1}PQ$. Show that $B$ is nonsingular and $A^{-1} + B^{-1} = (A + B)^{-1}$.


Exercise 536 
Show that there are infinitely-many involutory matrices in M2,2(Q). 


Exercise 537 
Let F = GF(2). Is the sum of all nonsingular matrices in. M>,.2(F) nonsingular? 


Exercise 538 
Let F be a field and let U be the set of all nonsingular matrices in M2,2(F). Is 
the function @ : U — U defined by 6: At A? a permutation of U? 


Exercise 539 
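Let $n$ be a positive integer, let $F$ be a field, and let $A \in \mathcal{M}_{2n\times 2n}(F)$ be a matrix
which can be written in the form $\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$, where each $A_{ij} \in \mathcal{M}_{n\times n}(F)$ is
nonsingular. Is $A$ necessarily nonsingular?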


Exercise 540 
Let $n$ be a positive integer and let $F$ be a field. Do there exist matrices $A, B \in \mathcal{M}_{n\times n}(F)$
such that the matrix $\begin{bmatrix} A^2 & AB \\ BA & B^2 \end{bmatrix} \in \mathcal{M}_{2n\times 2n}(F)$ is nonsingular?


Exercise 541 
Let $n$ be a positive integer and let $F$ be a field. For $A, B \in \mathcal{M}_{n\times n}(F)$ with $A$
nonsingular, show that $(A + B)A^{-1}(A - B) = (A - B)A^{-1}(A + B)$.


Exercise 542 

Let $n$ and $p$ be positive integers and let $F$ be a field. Let $A \in \mathcal{M}_{n\times n}(F)$ and let
$B, C \in \mathcal{M}_{n\times p}(F)$ be matrices satisfying the condition that $A$ and $(I + C^T A^{-1} B)$
are nonsingular. Show that $A + BC^T$ is nonsingular, and that

$$(A + BC^T)^{-1} = A^{-1} - A^{-1}B(I + C^T A^{-1} B)^{-1} C^T A^{-1}.$$


Exercise 543
Let $n$ be a positive integer and let $F$ be a field. If $\begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \neq v \in F^n$, show that there
exists a nonsingular matrix in $\mathcal{M}_{n\times n}(F)$ the rightmost column of which is $v$.


Exercise 544
Let F be a field. Show that every nonsingular matrix in. M2,2(F) can be written 


as a product of matrices of the form i I : i or | 


a 0 
1 0 0 1 ]foraer, 


0 1 


Exercise 545 


1 O t 
For each real number f, let A(t) = t 1 xt? M3 3(R). Show that 
0 O 1 


each such matrix is nonsingular and that the set of all such matrices is closed 
under taking products. 


Exercise 546 

Let $n$ be a positive integer and let $F$ be a field. Let $A \in \mathcal{M}_{n\times n}(F)$ be a matrix for
which there exists a positive integer $k$ satisfying $A^k = O$. Show that the matrix
$I - A$ is nonsingular and find $(I - A)^{-1}$.


Exercise 547 

Let $n$ be a positive integer and let $F$ be a field. Let $A \in \mathcal{M}_{n\times n}(F)$ be a matrix for
which there exists a matrix $B \in \mathcal{M}_{n\times n}(F)$ satisfying $I + A + AB = O$. Show
that $A$ is nonsingular.


Exercise 548 

Let $n$ be a positive integer and let $F$ be a field. Let $A, B \in \mathcal{M}_{n\times n}(F)$ satisfy the
condition that $A$ and $A + B$ are nonsingular. Show that $I + A^{-1}B$ is nonsingular
and that $(I + A^{-1}B)^{-1} = (A + B)^{-1}A$.


Exercise 549 
Find matrices $A$ and $B$ in $\mathcal{M}_{2\times 2}(\mathbb{R})$ satisfying $A^2 = B^2 = O$ such that $A + iB$
is a nonsingular matrix in $\mathcal{M}_{2\times 2}(\mathbb{C})$.


Exercise 550 


Let F be a field and let A = 


2 OF 


0 b 
1 0} €M3x3(F), where ab 4 1. Show that 
0 1 
A is nonsingular and calculate A~!. 


Exercise 551 


Let c 4 0 be an element of a field F and let A = € Maxa(F). 


ooon 
coon = 
oo FO 
te) 


Is $A$ nonsingular? If so, find $A^{-1}$.


Exercise 552

Let $n > 1$ and let $B \in \mathcal{M}_{n\times n}(\mathbb{Q})$ be the matrix all of the entries of which are equal
to $1$. Show that there exists a matrix $A \in \mathcal{M}_{n\times n}(\mathbb{Q})$ satisfying the condition that
$A + cB$ is nonsingular for all rational numbers $c$.


Exercise 553 
Let $n > 1$ and let $B \in \mathcal{M}_{n\times n}(\mathbb{Q})$ be the matrix all of the entries of which are
equal to $1$. Find a rational number $t$ such that $(I - B)^{-1} = I - tB$.


Exercise 554 
Let $n$ be a positive integer and let $A = [a_{ij}] \in \mathcal{M}_{n\times n}(\mathbb{R})$ be the matrix defined
by $a_{ij} = \min\{i, j\}$ for all $1 \leq i, j \leq n$. Show that $A$ is nonsingular.


Exercise 555 
Let A = [ajj] € M4 4(R) be the matrix defined by 


2 ifi=j-1, 
“ij =~) 1 otherwise. 


Show that A is nonsingular and calculate A~!. 


Exercise 556 
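For each real number $a$, let $G(a) = \begin{bmatrix} \cos(a) & \sin(a) \\ -\sin(a) & \cos(a) \end{bmatrix} \in \mathcal{M}_{2\times 2}(\mathbb{R})$. Given
real numbers $a$, $b$, and $c$, show that $G(a, b, c) = \begin{bmatrix} G(a) & G(b) \\ O & G(c) \end{bmatrix} \in \mathcal{M}_{4\times 4}(\mathbb{R})$ is
nonsingular, and find $G(a, b, c)^{-1}$.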


Exercise 557 
Find a singular matrix in M3,.3(Q) the entries of which (in some order) are the 
integers 1,2,...,9. 


Exercise 558 
Let $n$ be a positive integer and let $F$ be a field. Given elements $b, c \in F$, let
$A = [a_{ij}] \in \mathcal{M}_{n\times n}(F)$ be the matrix defined by

$$a_{ij} = \begin{cases} b & \text{if } i = j, \\ c & \text{otherwise.} \end{cases}$$

Find necessary and sufficient conditions for $A$ to be nonsingular.

Exercise 559 


Give an example of a singular matrix in M3,.3(Q) the entries of which are dis- 
tinct prime positive integers, or show that no such matrix can exist. 
Exercise 560


Let F be a field and let D = | : 


1 0 
the condition that A? DA = D. Show that A is nonsingular. 


€ Mox2(F). Let A € M2x2(F) satisfy 


Exercise 561 
Let n be a positive integer and let F be a field. Is the set of all singular matrices 


in Mnxn(F) closed under taking products? 


Exercise 562 


Let n be a positive integer, let F be a field, and let A, BE Myx» (F'). Show that 


“| € Monxan(F) 


A and B are both nonsingular if and only if the matrix ‘ B 


is nonsingular. 


Exercise 563 


Write the matrix : 7 € M2x2(R) as a product of elementary matrices. 


Exercise 564 


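Find the change of basis matrix from the canonical basis $B$ of $\mathbb{R}^3$ to the basis
$D = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right\}$ and the change of basis matrix from $D$ to $B$.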


Exercise 565 


Let G= é ; | 04a e€Ry}. Show that there exists a matrix E € G satisfy- 


ing the condition that EA = A= AE for all A € G. For each A € G, show that 


there exists a matrix A* € G satisfying AA* = E = A*A. 


Exercise 566 
Let F be a field. Given matrices A, B € M2,.2(F), find the set of all matrices 


C € Mx2(F) satisfying (AB — BA)C = C(AB — BA). 
Exercise 567
Let F be a field and let G be the set of all automorphisms of F? which are rep- 


i : i : a b 
resented with respect to the canonical basis by a matrix of the form ij. aot | 


Is G a group of automorphisms of F*? 


Exercise 568 
Let G be the set of all automorphisms of Q? which are represented with respect 


: 4 | where a,d>0.IsGa 


to the canonical basis by a matrix of the form 0 d 


group of automorphisms of Q?? 


Exercise 569 

Let $W_1 \subset W_2 \subset \cdots \subset W_n$ be a fixed sequence of subspaces of a vector space $V$
finitely generated over a field $F$. If $\alpha \in \mathrm{Aut}(V)$, we say that the given sequence
is an $\alpha$-fan if and only if each of the $W_i$ is invariant under $\alpha$. Show that
$G = \{\alpha \in \mathrm{Aut}(V) \mid \text{the given sequence is an } \alpha\text{-fan}\}$ is a group of automorphisms
of $V$.


Exercise 570 

For any real number ¢ and any positive integer n, we can define the matrix 
P(n,t) € Mnxn(R) to equal the identity matrix J in the case t = 0 and oth- 
erwise to equal the matrix [ p;;] defined by 


0 if i <j, 
Pij = (‘2-4 otherwise. 
j 


Show that $P(n, s)P(n, t) = P(n, s + t)$ for all $s, t \in \mathbb{R}$. In particular, show that
each matrix $P(n, t)$ is nonsingular.


Exercise 571 
Let F be a field and let X be an indeterminate over F’. Find matrices P and Q in 


2 
M2 x2(F[X]) such that the matrix P fone x 


xX (2. | Q is a diagonal matrix. 


Exercise 572 
Let n be a positive integer and let a: May2(C) ~ M4x4(R) be the function 


a b c d 

_jatbi ct+di —b a —-d c ee 

defined bya: | 27 | a # ok . Show that a is a lin- 
-f e@ —h g 


ear transformation of vector spaces over R. Is it a homomorphism of unital R- 
algebras? 
Exercise 573
Let $F$ be a field and let $A \in \mathcal{M}_{2\times 2}(F)$. Explicitly find a nonsingular matrix
$P \in \mathcal{M}_{2\times 2}(F)$ satisfying $PAP^{-1} = A^T$.


Exercise 574 

Let $Y$ be the subspace of $\mathcal{M}_{3\times 3}(\mathbb{R})$ consisting of all skew-symmetric matrices.
Show that $Y$ is isomorphic to $\mathbb{R}^3$ and find an isomorphism $\alpha : \mathbb{R}^3 \to Y$ satisfying
the condition that $\alpha(v)w = v \times w$ for all $v, w \in \mathbb{R}^3$.


Exercise 575 


Let A = E ‘| € M>,2(R). Does there exist a matrix B € M>,.2(R) sat- 
isfying B* = A? Does there exist a matrix C € M4,.4(R) satisfying C? = 
A O 
9 
O Al’ 
Exercise 576 
1 1 0 0 
0 1 0 0 : i ‘ 
Let A= 0011 € Ma4,.4(Q). Find the set of all monic polynomials 
000 1 


D(X) € Q[X] of degree 2 satisfying the condition that p(Ay* = A. (Caution: this 
set may be empty.) 


Exercise 577 

(Simpson's rule) Let $a < b$ be real numbers and let $c = \frac{1}{2}(a + b)$ be the midpoint
of the interval $[a, b]$. Given a continuous function $f \in \mathbb{R}^{[a,b]}$, use Lagrange interpolation
to show that $\int_a^b f(t)\,dt$ is approximately equal to $\frac{b-a}{6}\,[f(a) + 4f(c) + f(b)]$.


The eighteenth-century British mathematician Thomas Simpson was 
noted for his work on numerical approximations in calculus. 


Exercise 578 
Let F be a field and let k <n be positive integers. Let A € My xn(F) be writ- 


Ai Aj 


, Where Aj; is nonsingular. Let v, w € Fk 
Ai | e 


ten in block form as 
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=[ Ai Ay Ap 
Ap Ac! Ax — Ari Aj; A12 


s[sJe[2]emomee(3]=[2] 


and v’,w’ € F"-* and let B | Show that 


10 Systems of Linear Equations


Let k and n be positive integers. The classical problem of linear algebra is to find all 
solutions (if any exist) to a system of k linear equations in n unknowns of the form 


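$$
\begin{aligned}
a_{11}X_1 + \cdots + a_{1n}X_n &= b_1, \\
a_{21}X_1 + \cdots + a_{2n}X_n &= b_2, \\
&\;\;\vdots \\
a_{k1}X_1 + \cdots + a_{kn}X_n &= b_k,
\end{aligned}
$$

where the $a_{ij}$ and the $b_i$ are scalars belonging to some field $F$ and the $X_j$ are
variables which take values in the field.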

What about infinite systems of equations? The study of infinite systems of linear 
equations over R was indeed initiated by Hill and formalized by Poincaré but has 
since been subsumed into functional analysis and will not be considered here. It is 
known that every finite subsystem of an infinite system of linear equations over an 
arbitrary field F has a solution over F if and only if the infinite system has a solution 
over F’. 



George William Hill was a nineteenth-century 
American mathematical astronomer. French 
mathematician Jules Henri Poincaré was one of 
the foremost mathematical geniuses of the late 
nineteenth century. 


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 189
DOI 10.1007/978-94-007-2636-9_10, © Springer Science+Business Media B.V. 2012


Example Let $a < b$ be real numbers and let $V = C(a, b)$. If $W$ is a subspace of
$V$ of dimension $n$ then the interpolation problem of $V$ is the following: given a
function $f \in V$ and given real numbers $a \leq t_1 < \cdots < t_n \leq b$, find a function $g \in W$
satisfying $f(t_j) = g(t_j)$ for $1 \leq j \leq n$. If we are given a basis $\{g_1, \ldots, g_n\}$ of $W$
then we want to find real numbers $c_1, \ldots, c_n$ satisfying $\sum_{i=1}^{n} c_i g_i(t_j) = f(t_j)$ for
all $1 \leq j \leq n$. In other words, we want to solve a system of linear equations of the
above form, where $k = n$, $a_{ij} = g_j(t_i)$ and $b_i = f(t_i)$ for all $1 \leq i, j \leq n$.
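
The interpolation problem above can be sketched in code. The following is only an illustration under stated assumptions: the basis is taken to be the monomials $1, t, t^2$, the nodes are $0, 1, 2$, the function is $f(t) = t^3$, and the helper names `solve_square` and `interpolate` are hypothetical. The small solver assumes the coefficient matrix is nonsingular and works over the rationals.

```python
from fractions import Fraction

def solve_square(A, b):
    """Solve a nonsingular n x n system by elimination over the rationals.
    Raises StopIteration if no nonzero pivot can be found (singular matrix)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        lead = M[col][col]
        M[col] = [x / lead for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

def interpolate(basis, nodes, f):
    """Build a_ij = g_j(t_i), b_i = f(t_i) and solve for the coefficients c_j."""
    A = [[g(t) for g in basis] for t in nodes]
    b = [f(t) for t in nodes]
    return solve_square(A, b)

# Assumed data: basis {1, t, t^2}, nodes 0, 1, 2, and f(t) = t^3.
basis = [lambda t: 1, lambda t: t, lambda t: t * t]
nodes = [Fraction(0), Fraction(1), Fraction(2)]
coeffs = interpolate(basis, nodes, lambda t: t ** 3)
```

Here `coeffs` holds $c_1, c_2, c_3$, so that $g(t) = c_1 + c_2 t + c_3 t^2$ agrees with $f$ at the three nodes.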


Example In Proposition 4.2, we noted that if $F$ is a field and if $f(X)$ and $g(X) \neq 0$
are elements of $F[X]$, then there exist unique polynomials $u(X)$ and $v(X)$ in $F[X]$
satisfying $f(X) = g(X)u(X) + v(X)$ and $\deg(v) < \deg(g)$. If we set

$$g(X) = \sum_{i=0}^{k} a_i X^i \quad \text{and} \quad f(X) = \sum_{i=0}^{n} b_i X^i,$$

then the coefficients of $u(X) = \sum_{i=0}^{n-k} c_i X^i$ are found by solving the system of linear
equations

$$
\begin{aligned}
a_k Y_0 + a_{k-1} Y_1 + \cdots + a_0 Y_k &= b_k, \\
a_k Y_1 + a_{k-1} Y_2 + \cdots + a_0 Y_{k+1} &= b_{k+1}, \\
&\;\;\vdots \\
a_k Y_{n-k-1} + a_{k-1} Y_{n-k} &= b_{n-1}, \\
a_k Y_{n-k} &= b_n
\end{aligned}
$$

by any of the methods we will discuss.
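
Because the last equation of this system involves only $Y_{n-k}$, the system can be solved from the bottom up. The sketch below does this over the rationals; the function name `quotient_coefficients` and the list conventions (`a[i]` is $a_i$, `b[i]` is $b_i$) are assumptions made for illustration, and it presumes $a_k \neq 0$ and $n \geq k$.

```python
from fractions import Fraction

def quotient_coefficients(a, b):
    """Return [c_0, ..., c_{n-k}], the coefficients of u in f = g*u + v.

    a = [a_0, ..., a_k] are the coefficients of g (with a_k != 0),
    b = [b_0, ..., b_n] are the coefficients of f (with n >= k)."""
    k, n = len(a) - 1, len(b) - 1
    c = [Fraction(0)] * (n - k + 1)
    for j in range(n - k, -1, -1):
        # Coefficient of X^(k+j): a_k c_j + a_{k-1} c_{j+1} + ... = b_{k+j}
        known = sum(a[k - i] * c[j + i] for i in range(1, min(k, n - k - j) + 1))
        c[j] = (Fraction(b[k + j]) - known) / a[k]
    return c

# f(X) = X^3 + 2X^2 + 3X + 4 divided by g(X) = X + 1
u = quotient_coefficients([1, 1], [4, 3, 2, 1])
```

In this instance `u` represents $X^2 + X + 2$, and the remainder is the constant $2$.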


Example Sometimes we can transform systems of nonlinear equations into systems
of linear equations. For example, suppose that we want to find positive real numbers
$r_1$, $r_2$, and $r_3$ satisfying the following nonlinear system of equations:

$$
\begin{aligned}
r_1 r_2 r_3 &= 1, \\
r_1^3 r_2^2 r_3^2 &= 27, \\
r_3 / r_1 r_2 &= 81.
\end{aligned}
$$

Since each of the integers on the right is a power of 3, we can take the logarithm to
the base 3 of both sides of each equation. Setting $X_i = \log_3(r_i)$ for $1 \leq i \leq 3$, the
system now becomes linear:

$$
\begin{aligned}
X_1 + X_2 + X_3 &= 0, \\
3X_1 + 2X_2 + 2X_3 &= 3, \\
-X_1 - X_2 + X_3 &= 4,
\end{aligned}
$$

and this has a unique solution (which we can find by methods to be discussed in
this chapter) $X_1 = 3$, $X_2 = -5$, and $X_3 = 2$, showing that the original system has a
solution $r_1 = 27$, $r_2 = 1/243$, and $r_3 = 9$.
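
As a quick sanity check of this example, the following sketch substitutes the stated values back into both the linear system and the nonlinear one, using exact rational arithmetic. The middle nonlinear equation is taken here to be $r_1^3 r_2^2 r_3^2 = 27$, which is the equation whose base-3 logarithm is the second linear equation; the variable names are illustrative only.

```python
from fractions import Fraction

X = [3, -5, 2]                       # the stated solution of the linear system
r = [Fraction(3) ** x for x in X]    # r_i = 3^(X_i)

linear_ok = (
    X[0] + X[1] + X[2] == 0
    and 3 * X[0] + 2 * X[1] + 2 * X[2] == 3
    and -X[0] - X[1] + X[2] == 4
)
nonlinear_ok = (
    r[0] * r[1] * r[2] == 1
    and r[0] ** 3 * r[1] ** 2 * r[2] ** 2 == 27
    and r[2] / (r[0] * r[1]) == 81
)
```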


A system of linear equations of the above form is homogeneous if and only if
$b_i = 0$ for all $1 \leq i \leq k$; otherwise it is nonhomogeneous. At this stage, we do not
yet know answers to the following questions:

(1) Does a given system of linear equations have a solution?

(2) If it has a solution, is that solution unique? 

(3) If the solution is not unique, can we characterize the set of all solutions? 

(4) If there are solutions, how do we compute them efficiently? 

In order to answer these questions, we have to move to the language of matrices. The 
use of matrices for this purpose was developed in Europe in the nineteenth century 
by Cayley, Sylvester, and Laguerre. However, the real pioneers were the Chinese 
and Japanese mathematicians. During the time of the Han dynasty in China, around 
2000 years ago, the Nine Chapters on the Mathematical Art (Jiuzhang Suanshu) 
presented a method for solving systems of linear equations using matrices. A major 
commentary on this was subsequently written by Liu Hui. This, in turn, formed the 
basis for the later work of Seki. 


Edmond Laguerre, a nineteenth-century French mathematician, wrote an important
book on systems of linear equations in 1867. Liu Hui lived in the third century in the
Kingdom of Wei in north-central China. He added proofs and computational algorithms
using counting rods. Takakazu Seki Kowa was a seventeenth-century Japanese
mathematician, the son of a samurai warrior family, who developed matrix-based
methods based on Chinese texts.


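To see how this is done, let us write the above system in the form

$$
\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kn} \end{bmatrix}
\begin{bmatrix} X_1 \\ \vdots \\ X_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ \vdots \\ b_k \end{bmatrix}.
$$

The matrix $A = [a_{ij}] \in \mathcal{M}_{k\times n}(F)$ is the coefficient matrix of the system. If we set
$w = \begin{bmatrix} b_1 \\ \vdots \\ b_k \end{bmatrix} \in F^k$, then the matrix $[A \; w] \in \mathcal{M}_{k\times(n+1)}(F)$ is called the extended
coefficient matrix of the system. The set of all vectors $v = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix} \in F^n$ satisfying
$Av = w$ is the solution set of the system. This is clearly equal to $\alpha^{-1}(w)$, where
$\alpha : F^n \to F^k$ is the linear transformation satisfying $\Phi_{BD}(\alpha) = A$, where $B$ and $D$
are the canonical bases of $F^n$ and $F^k$, respectively. In particular, if the system is
homogeneous then its solution set is just the kernel of $\alpha$, and is called the solution
space of the system.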
We note the following simple but important point: if $F$ is a subfield of a field $K$
and if $k$ and $n$ are positive integers, then any matrix $A$ in $\mathcal{M}_{k\times n}(F)$ also belongs
to $\mathcal{M}_{k\times n}(K)$ and any vector $v \in F^n$ also belongs to $K^n$. Therefore, if $w \in F^k$, any
element of the solution set of $Av = w$, considered as a system of linear equations
over $F$, remains a solution when we consider this as a system of linear equations
over $K$.


Proposition 10.1 The solution set of a homogeneous system of linear equations
in $n$ unknowns over a field $F$ is a subspace of $F^n$.


Proof This is a direct consequence of Proposition 6.4. 


For nonhomogeneous systems, the situation is a bit more complicated. 


Proposition 10.2 Let $AX = w$ be a nonhomogeneous system of linear equations
in $n$ unknowns over a field $F$ and let $v_0 \in F^n$ be a solution to this system.
Then the solution set of the system is the set of all vectors in $F^n$ of the
form $v_0 + v$, where $v$ is a solution to the homogeneous system $AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$.


Proof This is an immediate consequence of Proposition 6.6. 


We should emphasize that the solution set of a nonhomogeneous system of linear 
equations is not a subspace of F” but rather an affine subset of that space. 
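
To make Proposition 10.2 concrete, here is a small numerical check. The matrix, right-hand side, particular solution `v0`, and homogeneous solution `v_h` below are chosen for illustration (they coincide with a system worked later in this chapter); the helper `mat_vec` is a hypothetical name.

```python
from fractions import Fraction

def mat_vec(A, v):
    """Product of a matrix (list of rows) with a vector (list of scalars)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[1, 2, 1], [2, 4, 3], [3, 6, 4]]
w = [-1, 3, 2]
v0 = [-6, 0, 5]     # one particular solution of AX = w
v_h = [-2, 1, 0]    # a solution of the homogeneous system AX = 0

# v0 + z * v_h should solve AX = w for every scalar z.
shifted = [[p + z * h for p, h in zip(v0, v_h)] for z in (Fraction(1, 2), 3, -7)]
all_solve = all(mat_vec(A, v) == w for v in shifted)
```

Note that the zero vector is not in this solution set, since `w` is nonzero, which is one way to see that the set is not a subspace.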


Example If we identify $\mathbb{R}^2$ with the Euclidean plane by associating each vector $\begin{bmatrix} a \\ b \end{bmatrix}$
with the point with coordinates $(a, b)$, then we see that its subspaces of dimension 1 are
precisely the straight lines going through the origin. The solution sets of linear equations
of the form $a_1 X_1 + a_2 X_2 = b$, where $b \neq 0$ and at least one of the $a_i$ is also
nonzero, are the straight lines in the plane which do not go through the origin.


We are still left with the question of how to actually find a solution to a system 
of linear equations. Here we can distinguish between two approaches: 

Direct Methods These methods involve the manipulation of the matrix A, either 
replacing it with another matrix which is easier to work with or factoring it into a 
product of matrices which are easier to work with, and thus reducing the difficulty 
of the problem. 

Iterative Methods These methods involve selecting a likely solution for the system 
and then repeatedly modifying it to obtain a sequence of vectors which (hopefully) 
will converge to an actual solution to the system. Such methods work, of course,
only if our vector space is one in which the notion of convergence is meaningfully 
defined. As we shall see, this is possible when the field of scalars is R or C. 

We begin by looking at direct methods. Let $P$ be a nonsingular matrix in
$\mathcal{M}_{k\times k}(F)$. A vector $v \in F^n$ is a solution to the system $AX = w$ over $F$ if and
only if it is a solution to the system $(PA)X = Pw$. In particular, this is true for
elementary matrices. Thus, given a system of linear equations, we can change the 
order of the equations, multiply one of the equations by a nonzero scalar, or add a 
scalar multiple of one equation to another, without changing the solution set of the 
system, so long as we do the same thing on both sides of the equal sign. In order to 
do this efficiently, it is best to work with the extended coefficient matrix [A w] and 
perform elementary operations on it to reduce it to a convenient form. 

Let $F$ be a field, let $k$ and $n$ be positive integers, and let $B = [b_{ij}] \in \mathcal{M}_{k\times n}(F)$.
The matrix $B$ is in row echelon form if and only if for each $1 \leq i \leq k$ there exists an
integer $1 \leq s(i) \leq n + 1$ such that
(1) $b_{ij} = 0$ for all $1 \leq j < s(i)$ but $b_{i\,s(i)} \neq 0$ if $s(i) \leq n$; and

(2) $s(1) < s(2) < \cdots < s(k)$.
| and 
0) 
4 
1 
7 


677 1 8 0 0 0 0 
: 9 21 1 000 2 6 : 
Example The matrices 0022 000 0 0| Hf in row 
000 1 000 0 0 


1 
0 
0 
0 
1 
echelon form. The matrix E is not in row echelon form. 
0 


Example If $n$ is a positive integer and if $B \in \mathcal{M}_{n\times n}(F)$ is in row echelon form,
then $B$ is surely upper triangular. However, $\begin{bmatrix} 1 & 0 & 2 & 7 \\ 0 & 0 & 3 & 8 \\ 0 & 0 & 9 & 9 \\ 0 & 0 & 0 & 5 \end{bmatrix}$ is an upper-triangular
matrix which is not in row echelon form.
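
The definition can be tested mechanically. The sketch below computes the positions $s(i)$ and checks that they increase; the function names are hypothetical. One assumption is made explicit in the code: rows consisting entirely of zeros, for which $s(i) = n + 1$, are allowed to repeat at the bottom of the matrix, as they do in the examples of this chapter.

```python
def leading_positions(B):
    """s(i) for each row: 1-based column of the first nonzero entry,
    or n + 1 if the row is zero."""
    n = len(B[0])
    return [next((j + 1 for j, x in enumerate(row) if x != 0), n + 1) for row in B]

def is_row_echelon(B):
    """True if the s(i) strictly increase, with zero rows (s = n + 1) allowed
    to repeat at the bottom (an assumption matching the chapter's examples)."""
    n = len(B[0])
    s = leading_positions(B)
    return all(s[i] < s[i + 1] or s[i] == s[i + 1] == n + 1
               for i in range(len(s) - 1))

def is_upper_triangular(B):
    return all(B[i][j] == 0 for i in range(len(B)) for j in range(min(i, len(B[0]))))

T = [[1, 0, 2, 7], [0, 0, 3, 8], [0, 0, 9, 9], [0, 0, 0, 5]]
E = [[1, 2, 3, 1], [0, -3, -2, 0], [0, 0, 0, 0]]
```

For the matrix `T` of the example, the second and third rows both have their first nonzero entry in column 3, which is why the test fails even though `T` is upper triangular.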


We claim that any matrix $A = [a_{ij}] \in \mathcal{M}_{k\times n}(F)$ is row equivalent to a matrix
in row echelon form. By Proposition 9.4, this is equivalent to saying that $A$ can be
transformed into a matrix in row echelon form by a series of elementary operations,
as follows:

(1) Find the leftmost column of $A$ which has a nonzero entry, say column $h$, and interchange rows
if necessary, so that this entry is in the first row. Thus we now have a matrix $A$
in which $a_{1h} \neq 0$ and $a_{ij} = 0$ for all $1 \leq i \leq k$ and all $1 \leq j < h$.

(2) For each $1 < i \leq k$, if $a_{ih} \neq 0$ then we multiply the first row by $-a_{ih}a_{1h}^{-1}$ and
add it to the $i$th row, which creates a new row in which the $(i, h)$-entry is equal
to $0$. Thus, we now have a matrix in which $a_{ih} = 0$ for all $1 < i \leq k$.

(3) Now consider the submatrix of $A$ from which we deleted the first row and the
first $h$ columns, and repeat the above procedure.
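
The three steps above can be sketched as follows, working over the rationals so that no rounding occurs. The function name `row_echelon` is an assumption made for illustration; instead of literally deleting rows and columns, the sketch keeps an index `top` marking the first row of the remaining submatrix.

```python
from fractions import Fraction

def row_echelon(A):
    """Return a row echelon form of A, following steps (1)-(3) of the text."""
    M = [[Fraction(x) for x in row] for row in A]
    k, n = len(M), len(M[0])
    top = 0                                   # first row of the current submatrix
    for h in range(n):
        # Step (1): look for a nonzero entry in column h, at or below row `top`.
        pivot = next((i for i in range(top, k) if M[i][h] != 0), None)
        if pivot is None:
            continue
        M[top], M[pivot] = M[pivot], M[top]
        # Step (2): clear the entries below the pivot.
        for i in range(top + 1, k):
            if M[i][h] != 0:
                c = -M[i][h] / M[top][h]
                M[i] = [x + c * y for x, y in zip(M[i], M[top])]
        # Step (3): continue with the submatrix below and to the right.
        top += 1
        if top == k:
            break
    return M

R = row_echelon([[1, 2, 3, 1], [2, 1, 4, 2], [1, -1, 1, 1]])
```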


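Example Let us begin with the matrix $A = \begin{bmatrix} 1 & 2 & 3 & 1 \\ 2 & 1 & 4 & 2 \\ 1 & -1 & 1 & 1 \end{bmatrix} \in \mathcal{M}_{3\times 4}(\mathbb{R})$. We already
have $a_{11} \neq 0$. Multiplying the first row by $-2$ and adding it to the second
row, we obtain $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & -3 & -2 & 0 \\ 1 & -1 & 1 & 1 \end{bmatrix}$ and then multiplying the first row by $-1$ and
adding it to the third row, we obtain $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & -3 & -2 & 0 \\ 0 & -3 & -2 & 0 \end{bmatrix}$. We also already have
$a_{22} \neq 0$. Multiplying the second row by $-1$ and adding it to the third row, we obtain
$\begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & -3 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, and this is in row echelon form.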


If $A = [a_{ij}] \in \mathcal{M}_{k\times n}(F)$ is a matrix in row echelon form, and if the $h$th row of $A$
contains nonzero entries, then the leftmost nonzero entry of the row is the leading
entry. The matrix $A$ is in reduced row echelon form if it is in row echelon form and,
in addition, satisfies the following conditions:

(1) The leading entry in each nonzero row is equal to 1;

(2) If $a_{hj}$ is a leading entry, then $a_{ij} = 0$ for all $i \neq h$.

Any matrix in row echelon form is row-equivalent to one in reduced row echelon form;
that is to say, such a matrix can be converted to one in reduced row echelon form by
performing additional elementary operations: first, we multiply each nonzero row
by the multiplicative inverse of its leading entry, to obtain a matrix in which the
leading entry of each nonzero row equals 1. Then, if $a_{hj}$ is a leading entry and if
$i < h$, we multiply the $h$th row by $-a_{ij}$ and add it to the $i$th row, which will give
us a matrix with the $(i, j)$-entry equal to 0. The reduced row echelon form of any
given matrix is clearly unique.


Example Let us go back and look at the matrix $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & -3 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$ in row echelon
form. The leading entry of the first row is already equal to 1. Multiplying the second
row by $-\frac{1}{3}$, we obtain $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & 1 & \frac{2}{3} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, a matrix in which the leading entry of the
second row is equal to 1 as well. Now multiply the second row by $-2$ and add it to
the first row, to obtain $\begin{bmatrix} 1 & 0 & \frac{5}{3} & 1 \\ 0 & 1 & \frac{2}{3} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, which is in reduced row echelon form.
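
A sketch of the reduction to reduced row echelon form is given below. It is a variant of the procedure described in the text, not a transcription of it: rather than first reaching row echelon form and then normalizing, it scales each pivot row to have leading entry 1 and clears the rest of the pivot column as soon as the pivot is found. The name `reduced_row_echelon` is hypothetical, and exact rational arithmetic is assumed.

```python
from fractions import Fraction

def reduced_row_echelon(A):
    """Return the reduced row echelon form of A over the rationals."""
    M = [[Fraction(x) for x in row] for row in A]
    k, n = len(M), len(M[0])
    top = 0
    for h in range(n):
        pivot = next((i for i in range(top, k) if M[i][h] != 0), None)
        if pivot is None:
            continue
        M[top], M[pivot] = M[pivot], M[top]
        lead = M[top][h]
        M[top] = [x / lead for x in M[top]]        # leading entry becomes 1
        for i in range(k):
            if i != top and M[i][h] != 0:          # clear the rest of column h
                c = M[i][h]
                M[i] = [x - c * y for x, y in zip(M[i], M[top])]
        top += 1
        if top == k:
            break
    return M

R = reduced_row_echelon([[1, 2, 3, 1], [0, -3, -2, 0], [0, 0, 0, 0]])
```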
Example Even this very simple algorithm can lead to computational problems. Let
$n$ be a positive integer and let $A = [a_{ij}] \in \mathcal{M}_{n\times n}(\mathbb{R})$ be the matrix defined as follows:

$$a_{ij} = \begin{cases} 1 & \text{if } i = j \text{ or } j = n, \\ -1 & \text{if } i > j, \\ 0 & \text{otherwise.} \end{cases}$$

If we use the above method to reduce $A$ to row echelon form we obtain a
matrix $B = [b_{ij}]$ where

$$b_{ij} = \begin{cases} 1 & \text{if } i = j < n, \\ 2^{i-1} & \text{for } j = n, \\ 0 & \text{otherwise.} \end{cases}$$

If $n$ is sufficiently large, the element $b_{nn} = 2^{n-1}$ may be considerably corrupted due to
roundoff and truncation error.
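
The growth of the last column can be observed directly. The sketch below uses exact rational arithmetic, so it shows only the size of the entries and not the rounding itself; in floating-point arithmetic, entries that grow this quickly are what leave room for roundoff to do damage. The function names are hypothetical, and the elimination omits row interchanges because every pivot of this particular matrix is nonzero.

```python
from fractions import Fraction

def growth_matrix(n):
    """The n x n matrix of the example: 1 on the diagonal and in the last column,
    -1 below the diagonal, 0 elsewhere (indices in the code are 0-based)."""
    return [[1 if (i == j or j == n - 1) else (-1 if i > j else 0)
             for j in range(n)] for i in range(n)]

def forward_eliminate(A):
    """Reduce a square matrix to row echelon form without row interchanges.
    Assumes each diagonal pivot is nonzero, as it is for growth_matrix(n)."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    for h in range(n):
        for i in range(h + 1, n):
            c = -M[i][h] / M[h][h]
            M[i] = [x + c * y for x, y in zip(M[i], M[h])]
    return M

last_column = [row[-1] for row in forward_eliminate(growth_matrix(6))]
```

Although every entry of the original matrix has absolute value at most 1, the last column of the result doubles from one row to the next.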


Reduction of a matrix in $\mathcal{M}_{k\times n}(F)$ to reduced row echelon form depends
strongly on the fact that every nonzero element in a field has a multiplicative inverse.
If we are considering matrices in $\mathcal{M}_{k\times n}(K)$, where $K$ is the unital commutative
associative algebra of polynomials in one or several variables over $F$, this no
longer holds. In such situations, however, it is possible to reduce a matrix to a form
known as Howell Canonical Form, which is equivalent to row echelon form with
leading entries equal to 1 in the case we are working over a field. This is important
for computations since, as we will see, algebras of the form $\mathcal{M}_{k\times n}(F[X])$ have an
important part to play in the theory we are developing.

Now let us return to the system of linear equations $AX = w$ in $n$ unknowns
and consider methods of solution. The most well-known is Gaussian elimination or
the Gauss-Jordan method. In this method, we first perform elementary operations
on the extended coefficient matrix $[A \; w]$ to bring it to reduced row echelon form.
Having done this, we now have a new system of linear equations $A'X = w'$, the
solution set of which is the same as that of the original system. Write $A' = [a'_{ij}]$ and
let $b'_i$ denote the $i$th entry of $w'$. Let $t$ be the greatest
integer $i$ such that the $i$th row of $[A' \; w']$ has nonzero entries. There are several possibilities:
(1) $b'_t \neq 0$ but $a'_{tj} = 0$ for all $1 \leq j \leq n$. Then the system has no solutions, and we
are done.

(2) There is precisely one index $j$ such that $a'_{tj} \neq 0$. Then this must in fact be the
leading entry of the $t$th row and so $a'_{tj} = 1$. This means that in any element of
the solution set of the system we must have the $j$th entry equal to $b'_t$. We can
therefore substitute $b'_t$ for $X_j$ in each of the other equations, and reduce the
system to a system of equations in $n - 1$ unknowns.

(3) There are several indices $j$ such that $a'_{tj} \neq 0$, say those in columns $h_1 < h_2 <
\cdots < h_m$. Then $a'_{t h_1}$ is the leading entry of the $t$th row and so equals 1. Moreover,
for any values $z_2, \ldots, z_m$ we substitute for $X_{h_2}, \ldots, X_{h_m}$, we will get
a solution to the system with these values and with $b'_t - \sum_{s=2}^{m} a'_{t h_s} z_s$ substituted
for $X_{h_1}$. Thus we can consider the $z_s$ as parameters of a general solution and
again reduce the system to one in a smaller number of unknowns.

(4) Having reduced the system, we now recursively apply the previous steps until
the system is solved.



Carl Friedrich Gauss, who lived in Germany at 
the beginning of the nineteenth century, is con- 
sidered to be the leading mathematician of all 
times, as well as a physicist and astronomer of the 
first rank. He developed this method in connec- 
tion with his work in astronomy in 1809. Gaussian elimination first appeared in print in 
a handbook by German geodesist Wilhelm Jordan, who applied the method to problems 
in surveying. The first computer program to solve a system of linear equations by Gaus- 
sian elimination was written by Lady Augusta Ada Lovelace, a student of De Morgan and 
daughter of the poet Lord Byron, who developed software for Charles Babbage’s (never 
completed) mechanical computer in the nineteenth century. Her program was capable of 
solving systems of 10 linear equations in 10 unknowns. 


Strassen's insight that Gaussian elimination may not be the optimal method of
solving systems of linear equations, as had been previously thought, led to the
development of his method of matrix multiplication.


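Example Let us consider the system of linear equations

$$
\begin{aligned}
3X_1 + 2X_2 + X_3 &= 0, \\
-2X_1 + X_2 - X_3 &= 2, \\
2X_1 - X_2 + 2X_3 &= -1
\end{aligned}
$$

over the field $\mathbb{R}$. The extended coefficient matrix of this system is $\begin{bmatrix} 3 & 2 & 1 & 0 \\ -2 & 1 & -1 & 2 \\ 2 & -1 & 2 & -1 \end{bmatrix}$
and this is row equivalent to the matrix $\begin{bmatrix} 3 & 2 & 1 & 0 \\ 0 & 7 & -1 & 6 \\ 0 & 0 & 1 & 1 \end{bmatrix}$ in row echelon form,
which is in turn row equivalent to the matrix $\begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}$ in reduced row
echelon form. Thus we see that the solution set of the system is $\left\{ \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} \right\}$.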
Example Let us consider the system of linear equations

$$
\begin{aligned}
X_1 + X_2 &= 1, \\
X_1 - X_2 &= 3, \\
-X_1 + 2X_2 &= -2
\end{aligned}
$$

over the field $\mathbb{R}$. The extended coefficient matrix of this system equals $\begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 3 \\ -1 & 2 & -2 \end{bmatrix}$
and this is row equivalent to the matrix $\begin{bmatrix} 1 & 1 & 1 \\ 0 & -2 & 2 \\ 0 & 0 & 2 \end{bmatrix}$ in row echelon form, which is
row equivalent to the matrix $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ in reduced row echelon form. Therefore,
this system has no solutions at all.


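Example Let us consider the system of linear equations

$$
\begin{aligned}
X_1 + 2X_2 + X_3 &= -1, \\
2X_1 + 4X_2 + 3X_3 &= 3, \\
3X_1 + 6X_2 + 4X_3 &= 2
\end{aligned}
$$

over $\mathbb{R}$. The extended coefficient matrix of this system is $\begin{bmatrix} 1 & 2 & 1 & -1 \\ 2 & 4 & 3 & 3 \\ 3 & 6 & 4 & 2 \end{bmatrix}$ and
this is row equivalent to $\begin{bmatrix} 1 & 2 & 1 & -1 \\ 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 \end{bmatrix}$ in row echelon form, which is in turn
row equivalent to $\begin{bmatrix} 1 & 2 & 0 & -6 \\ 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 \end{bmatrix}$ in reduced row echelon form. From the second
row, we see that we must have $X_3 = 5$. From the first row, we have $X_1 + 2X_2 = -6$
and so, for each value $X_2 = z$, we have a solution with $X_1 = -6 - 2z$. Therefore,
the solution set to our system is $\left\{ \begin{bmatrix} -6 - 2z \\ z \\ 5 \end{bmatrix} \;\middle|\; z \in \mathbb{R} \right\}$.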
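
The Gauss-Jordan method, applied to the three systems just considered, can be sketched as follows. This is an illustrative implementation under stated assumptions rather than a transcription of the recursive description above: it reads the answer off the reduced row echelon form of $[A \; w]$ in one pass, and the names `rref` and `solve` are hypothetical. The function returns `None` when there is no solution, and otherwise a particular solution together with a basis of the solution space of the homogeneous system.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form over the rationals, plus the pivot columns."""
    M = [[Fraction(x) for x in row] for row in rows]
    k, n = len(M), len(M[0])
    pivots, top = [], 0
    for h in range(n):
        p = next((i for i in range(top, k) if M[i][h] != 0), None)
        if p is None:
            continue
        M[top], M[p] = M[p], M[top]
        lead = M[top][h]
        M[top] = [x / lead for x in M[top]]
        for i in range(k):
            if i != top and M[i][h] != 0:
                c = M[i][h]
                M[i] = [x - c * y for x, y in zip(M[i], M[top])]
        pivots.append(h)
        top += 1
        if top == k:
            break
    return M, pivots

def solve(A, w):
    n = len(A[0])
    R, pivots = rref([list(row) + [b] for row, b in zip(A, w)])
    if n in pivots:                       # a row (0 ... 0 | nonzero): no solutions
        return None
    particular = [Fraction(0)] * n
    for r, p in enumerate(pivots):
        particular[p] = R[r][n]
    basis = []
    for f in (j for j in range(n) if j not in pivots):   # free unknowns
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -R[r][f]
        basis.append(v)
    return particular, basis

first = solve([[3, 2, 1], [-2, 1, -1], [2, -1, 2]], [0, 2, -1])
second = solve([[1, 1], [1, -1], [-1, 2]], [1, 3, -2])
third = solve([[1, 2, 1], [2, 4, 3], [3, 6, 4]], [-1, 3, 2])
```

For the third system the result encodes the solution set as the particular solution $(-6, 0, 5)$ plus all multiples of $(-2, 1, 0)$, which is the description obtained above with $z$ as the parameter.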
Gaussian elimination can also be used to check if a set of vectors in $F^k$ is linearly
independent. Let $\{v_1, \ldots, v_n\}$ be a set of vectors in $F^k$, where $v_j = \begin{bmatrix} a_{1j} \\ \vdots \\ a_{kj} \end{bmatrix}$ for all $j$.
We want to know if there are scalars $b_1, \ldots, b_n$ in $F$, not all equal to 0, satisfying
$\sum_{j=1}^{n} b_j v_j = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. That is, we want to know if the homogeneous system of linear
equations $AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ has a nonzero solution, where $A = [a_{ij}] \in \mathcal{M}_{k\times n}(F)$.
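
This test can be sketched in code: place the vectors as the columns of $A$, reduce, and look for an unknown that is not a leading one. The vectors `v1`, `v2`, `v3` below are assumed for illustration and are not taken from the text; the names `rref` and `dependency` are likewise hypothetical. The function returns coefficients $b_1, \ldots, b_n$, not all zero, if the set is dependent, and `None` otherwise.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form over the rationals, plus the pivot columns."""
    M = [[Fraction(x) for x in row] for row in rows]
    k, n = len(M), len(M[0])
    pivots, top = [], 0
    for h in range(n):
        p = next((i for i in range(top, k) if M[i][h] != 0), None)
        if p is None:
            continue
        M[top], M[p] = M[p], M[top]
        lead = M[top][h]
        M[top] = [x / lead for x in M[top]]
        for i in range(k):
            if i != top and M[i][h] != 0:
                c = M[i][h]
                M[i] = [x - c * y for x, y in zip(M[i], M[top])]
        pivots.append(h)
        top += 1
        if top == k:
            break
    return M, pivots

def dependency(vectors):
    """Coefficients of a nontrivial linear relation, or None if independent."""
    k, n = len(vectors[0]), len(vectors)
    A = [[vectors[j][i] for j in range(n)] for i in range(k)]   # columns are the v_j
    R, pivots = rref(A)
    free = [j for j in range(n) if j not in pivots]
    if not free:
        return None
    b = [Fraction(0)] * n
    b[free[0]] = Fraction(1)
    for r, p in enumerate(pivots):
        b[p] = -R[r][free[0]]
    return b

# Assumed sample vectors in Q^4 (hypothetical data).
v1, v2, v3 = [1, 0, 1, 2], [1, 1, 0, 3], [1, -1, 2, 1]
relation = dependency([v1, v2, v3])
```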


Example Let us check if the subset 7 3 | of Q* is linearly 


a 
RD 


3 
4 


coooco 


-1 

1 

0 

4 
3 
dependent, and to do so we need to consider the matrix A = - : = 
4 


This matrix is row equivalent to the matrix in reduced row ech- 


ocooor 
cooroeo 
eo oo © 


elon form. Therefore, the set of solutions to the homogeneous system AX = 


—2z —2 
is z |)z€Q} so that if we pick one such nonzero element, say 1 |, we 
z 


+ 


see that (—2) , Showing that the set is indeed 


6 
4 4 
linearly dependent. 


oooco 


1 
1 
0 
4 


We note that if $A \in \mathcal{M}_{k \times n}(F)$ then the number of arithmetic operations needed
to solve a system of linear equations of the form $AX = w$ using Gaussian elimination
is no more than $\frac{1}{6}k(k-1)(3n-k-2)$ if $k \le n$ and no more than
$\frac{1}{6}n[3kn + 3(k-n) - n^2 - 2]$ otherwise. Of course, if the matrix $A$ is of a special
form, this procedure can be much faster. For example, if $A \in \mathcal{M}_{n \times n}(F)$ is a
tridiagonal matrix, then a system of equations of the form $AX = w$ can be solved
using $3n$ additions/subtractions and $5n$ multiplications.

If $A \in \mathcal{M}_{n \times n}(F)$ is a nonsingular matrix which can be written in the form $LU$,
where $L$ is lower triangular and $U$ is upper triangular, then both $L$ and $U$ must also be
nonsingular. A system of linear equations of the form $LY = w$ is easy to solve by forward
substitution, and it has a unique solution $y = L^{-1}w$. The system $AX = w$ then has a unique
solution, which is also the solution to the system $UX = y$, and that system too is easy
to solve using Gaussian elimination, since $U$ is already in row-echelon form. We therefore
see the importance of the LU-decomposition of matrices, assuming that one exists.
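As a hedged illustration of why a factorization $A = LU$ makes $AX = w$ easy to solve, the sketch below first solves $LY = w$ by forward substitution and then $UX = y$ by back substitution. The matrices `L` and `U` and the function names are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def forward_substitution(L, w):
    """Solve L y = w for a nonsingular lower-triangular L."""
    y = []
    for i, row in enumerate(L):
        known = sum(row[j] * y[j] for j in range(i))
        y.append((Fraction(w[i]) - known) / row[i])
    return y

def back_substitution(U, y):
    """Solve U x = y for a nonsingular upper-triangular U."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(y[i]) - known) / U[i][i]
    return x

def solve_lu(L, U, w):
    """Solve (LU) x = w in two triangular steps."""
    return back_substitution(U, forward_substitution(L, w))

# Hypothetical factors; their product is A = [[2, 1, 1], [4, 3, 0], [-2, 2, -4]].
L = [[1, 0, 0], [2, 1, 0], [-1, 3, 1]]
U = [[2, 1, 1], [0, 1, -2], [0, 0, 3]]
x = solve_lu(L, U, [7, 10, -10])
```

Each triangular solve needs on the order of $n^2$ operations, so once the factors are known, further right-hand sides $w$ can be handled cheaply.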

Given a matrix $A \in \mathcal{M}_{k \times n}(F)$, we define the column space of $A$ to be the
subspace of $F^k$ generated by the set of all columns of $A$. The dimension of the column
space of $A$ is called the rank of $A$. Moreover, there exists a linear transformation
$\alpha\colon F^n \to F^k$ satisfying the condition that $\Phi_{BD}(\alpha) = A$, where $B$ and $D$
are the canonical bases of $F^n$ and $F^k$, respectively, and it is clear that the column space
of $A$ is just $\operatorname{im}(\alpha)$. Similarly, we define the row space of $A$ to be the
subspace of $\mathcal{M}_{1 \times n}(F)$ generated by the rows of $A$. We will show that the
dimension of this space is also equal to the rank of $A$.


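Proposition 10.3 Let $F$ be a field, let $k$ and $n$ be positive integers, and let
$A \in \mathcal{M}_{k \times n}(F)$ and let $w = \begin{bmatrix} b_1 \\ \vdots \\ b_k \end{bmatrix} \in F^k$.
Then the system of linear equations $AX = w$ has a solution if and only if $w$ belongs to the
column space of $A$.


Proof If $v = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}$ is a solution of the system $AX = w$ then

$$w = \sum_{j=1}^{n} d_j \begin{bmatrix} a_{1j} \\ \vdots \\ a_{kj} \end{bmatrix}$$

and so $w$ is a linear combination of the columns of $A$. Conversely, if we assume that
there exist scalars $d_1, \ldots, d_n$ in $F$ such that
$w = \sum_{j=1}^{n} d_j \begin{bmatrix} a_{1j} \\ \vdots \\ a_{kj} \end{bmatrix}$, then
$v = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}$ is a solution of the given system.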


In particular, we get the following consequence of this result.

Proposition 10.4 Let $F$ be a field, let $k$ and $n$ be positive integers, and let
$A \in \mathcal{M}_{k \times n}(F)$ and let $w = \begin{bmatrix} b_1 \\ \vdots \\ b_k \end{bmatrix} \in F^k$.
Then the system of linear equations $AX = w$ has a solution if and only if the rank of the
coefficient matrix $A$ is equal to the rank of the extended coefficient matrix.


Now let us return to the problem of identifying the solution sets of homogeneous 
systems of linear equations. 


Proposition 10.5 Let $F$ be a field, let $k$ and $n$ be positive integers, and let
$A \in \mathcal{M}_{k \times n}(F)$ be a matrix the columns of which are vectors $y_1, \ldots, y_n$ in $F^k$.
Assume these columns are arranged such that $\{y_1, \ldots, y_r\}$ is a basis for the
column space of $A$, for some $r \le n$. Moreover, for all $r < h \le n$, let us select
scalars $b_{h1}, \ldots, b_{hn}$ such that:
(1) $y_h = b_{h1}y_1 + \cdots + b_{hr}y_r$;
(2) $b_{hh} = -1$;
(3) $b_{hj} = 0$ otherwise.
For each $r < h \le n$, let $v_h = \begin{bmatrix} b_{h1} \\ \vdots \\ b_{hn} \end{bmatrix} \in F^n$.
Then $\{v_{r+1}, \ldots, v_n\}$ is a basis for the solution space of the homogeneous system of
linear equations $AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$.


(Comment before the proof: Since $\{y_1, \ldots, y_n\}$ is a set of generators for the column
space of $A$, it contains a subset that is a basis. The assumption that this is
$\{y_1, \ldots, y_r\}$ is for notational convenience only.)


Proof If $r = n$ then the solution space of the system of linear equations is
$\left\{ \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \right\}$
and so the result is immediate. Hence let us assume that $r < n$. If $r < h \le n$, then
$Av_h = \sum_{j=1}^{r} b_{hj}y_j - y_h = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$,
and so each $v_h$ belongs to the solution space of $AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$.
Moreover, the set $\{v_{r+1}, \ldots, v_n\}$ is linearly independent, since if
$\sum_{j=r+1}^{n} c_j v_j = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$
then for each $r < h \le n$ we note that the $h$th entry on the
left-hand side is $-c_h$, whereas the corresponding entry on the right-hand side is 0,
proving that $c_h = 0$ for all $r < h \le n$.
We are therefore left to show that $\{v_{r+1}, \ldots, v_n\}$ is a generating set for the
solution space of the given homogeneous system. And, indeed, let
$w = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}$ be a vector in this solution space. Then
$w + \sum_{h=r+1}^{n} d_h v_h = \begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix}$, where
$e_{r+1} = \cdots = e_n = 0$. This vector also belongs to the solution space of the system, and
so $\sum_{h=1}^{r} e_h y_h = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$.
However, since the set $\{y_1, \ldots, y_r\}$ is linearly independent,
this implies that $e_1 = \cdots = e_r = 0$ as well. Therefore, $w = -\sum_{h=r+1}^{n} d_h v_h$,
showing that $\{v_{r+1}, \ldots, v_n\}$ is a generating set for the solution space, as required.


As an immediate consequence of Proposition 10.5, we obtain the following re- 
sult. 


Proposition 10.6 Let $F$ be a field, let $k$ and $n$ be positive integers, and let
$A \in \mathcal{M}_{k \times n}(F)$. Then the dimension of the solution space of the homogeneous
system of linear equations $AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ is $n - r$,
where $r$ is the rank of the coefficient matrix $A$.


We are now ready to prove the characterization of rank which we mentioned 
before. 


Proposition 10.7 Let $F$ be a field, let $k$ and $n$ be positive integers, and let
$A \in \mathcal{M}_{k \times n}(F)$. Then the rank of $A$ equals the dimension of the row space
of $A$.


Proof Let $v_1, \ldots, v_k$ be the rows of $A$, which generate a subspace of $\mathcal{M}_{1 \times n}(F)$.
We can reorder these rows in such a way that $\{v_1, \ldots, v_t\}$ is a basis for the row
space, for some $1 \le t \le k$. This, as we know, does not change the solution space
of the homogeneous system of linear equations $AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$
and hence does not change the rank $r_A$ of $A$. Let $B \in \mathcal{M}_{t \times n}(F)$ be the matrix
obtained from $A$ by deleting rows $t + 1, \ldots, k$. The columns of $B$ belong to $F^t$ and so
the rank $r_B$ of $B$ satisfies $r_B \le t$, which implies that $n - t \le n - r_B$. But we have
already seen that the homogeneous systems of linear equations
$AX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ and $BX = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$
have the same solution space and so, by Proposition 10.6, $n - r_A = n - r_B$. Hence
$n - t \le n - r_A$, and from this we conclude that $r_A \le t$. We have thus shown that the rank
of any matrix is less than or equal to the dimension of its row space. In particular, this is
also true for $A^T$. But the rank of $A^T$ is $t$, while the dimension of its row space is $r_A$,
and so we have $t \le r_A$ as well, proving equality.


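Example Let us find a basis for the solution space of the system of linear equations
$\begin{bmatrix} 1 & 2 & -3 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
over $\mathbb{R}$. We know that the coefficient matrix is row-equivalent to the matrix
$\begin{bmatrix} 1 & 0 & 5 & 1 \\ 0 & 1 & -4 & 0 \end{bmatrix}$ in reduced row echelon form, and
this matrix has rank 2. Therefore, the solution space of the system has dimension
$4 - 2 = 2$. Indeed, it is easy to check that
$\left\{ \begin{bmatrix} -5 \\ 4 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right\}$
is a basis for this solution space.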
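Proposition 10.6 can be illustrated computationally: after reducing a matrix to reduced row echelon form, each column without a leading entry yields one basis vector of the solution space. The sketch below is one possible way to do this, assuming exact rational arithmetic; the $2 \times 4$ matrix is an illustrative choice of rank 2, and the function names are hypothetical.

```python
from fractions import Fraction

def rref_with_pivots(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for col in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][col]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space_basis(rows):
    """Basis of the solution space of the homogeneous system with this coefficient matrix."""
    m, pivots = rref_with_pivots(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)          # set one free unknown to 1
        for row_index, p in enumerate(pivots):
            v[p] = -m[row_index][free]  # solve for the pivot unknowns
        basis.append(v)
    return basis

A = [[1, 2, -3, 1], [1, 1, 1, 1]]  # illustrative matrix of rank 2
basis = null_space_basis(A)         # has n - r = 4 - 2 = 2 elements
```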


Gaussian elimination requires an order of magnitude of $n^3$ arithmetic operations
to solve a system of n linear equations in n unknowns. This computational overhead 
is quite significant if n is large (say, over 10,000), even with the use of supercomput- 
ers. As a result, there is considerable continuing research into finding faster methods 
of computation, especially in those cases in which we have additional information 
on the structure of the matrix of coefficients, originating in knowledge of the par- 
ticular problem from which the system arose. Often this structural information is 
immediately noticeable, but sometimes it appears only after a sophisticated consid- 
eration of the problem. 


Example It is often possible to show that the matrix we are interested in, while not
itself having a special structure, is equal to the product of two matrices having a special
structure, a situation which arises in many mathematical models. Let us consider
one such case. An $n \times n$ symmetric Toeplitz matrix is a matrix $B = [b_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{R})$
satisfying the condition that there exist real numbers $c_0, \ldots, c_{n-1}$ such that $b_{ij} = c_h$
whenever $|i - j| = h$. Thus, for example, the matrix
$\begin{bmatrix} 1 & 2 & 0 & 7 \\ 2 & 1 & 2 & 0 \\ 0 & 2 & 1 & 2 \\ 7 & 0 & 2 & 1 \end{bmatrix}$
is a symmetric Toeplitz matrix. Clearly, the set of all symmetric Toeplitz matrices is a subspace of
$\mathcal{M}_{n \times n}(\mathbb{R})$. However, it is not a subalgebra, since the product of two such matrices
need not be a symmetric Toeplitz matrix. Symmetric Toeplitz matrices are also convenient to store
in a computer, since we need to keep in memory only the $n$ scalars $c_0, \ldots, c_{n-1}$. Note that
symmetric Toeplitz matrices are symmetric with respect to both main diagonals.
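The storage remark can be made concrete: the whole matrix is recovered from the list $c_0, \ldots, c_{n-1}$ by the rule $b_{ij} = c_{|i-j|}$. The following sketch, with hypothetical helper names, also checks on a small example that a product of two symmetric Toeplitz matrices can fail to be one.

```python
def symmetric_toeplitz(c):
    """Build the n x n symmetric Toeplitz matrix with b_ij = c[|i - j|]."""
    n = len(c)
    return [[c[abs(i - j)] for j in range(n)] for i in range(n)]

def is_symmetric_toeplitz(m):
    """Check that every entry depends only on |i - j|."""
    n = len(m)
    return all(m[i][j] == m[0][abs(i - j)] for i in range(n) for j in range(n))

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

B = symmetric_toeplitz([1, 2, 0, 7])   # only four scalars are stored

# Two 3 x 3 symmetric Toeplitz matrices whose product is not Toeplitz:
S = symmetric_toeplitz([1, 2, 0])
T = symmetric_toeplitz([0, 1, 0])
product = matmul(S, T)                 # its diagonal entries are 2, 4, 2
```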

Many mathematical models in economics are built around solving systems of lin- 
ear equations of the form AX = w, where A is a product of two symmetric Toeplitz 
matrices—a fact which emerges from a knowledge of economic theory. 



The proper use of mathematical techniques, and especially computational tech- 
niques, also depends very much on a deep understanding of the particular problem 
one is dealing with. Also, it is crucial to emphasize once again that any method 
we use to solve a system of linear equations on a computer will induce errors as a 
result of roundoff and truncation in our computations. With some methods—such 
as Gaussian elimination—these errors tend to accumulate, whereas with others they 
often cancel each other out, within certain limits. It is therefore necessary, espe- 
cially when we are dealing with large matrices, to have on hand several methods 
of handling such systems of equations and to be able to keep track of the way in 
which errors can propagate in each of the different methods at one’s disposal. The
matter of the numerical stability of solutions to such systems was investigated by
Wilkinson, among many others.




Otto Toeplitz was a twentieth-century German mathematician who 
studied endomorphisms of infinite-dimensional vector spaces. 



The problem of computing solutions of systems of linear equations was the subject of
considerable research in the early days of computers. Among the contributors were the
Russian husband-and-wife team of Dimitri Konstantinovich Faddeev and Vera Nikolaevna Faddeeva.


The following is a useful trick which we will need later.

Proposition 10.8 Let $F$ be a subfield of a field $K$. Let $k$ and $n$ be positive
integers and let $A \in \mathcal{M}_{k \times n}(F)$. Suppose that there exists a nonzero vector
$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \in K^n$ satisfying
$Ax = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. Then there exists a nonzero vector
$y \in F^n$ satisfying $Ay = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$.


Proof Let $V = F\{x_1, \ldots, x_n\}$, which is a subspace of $K$, considered as
a vector space over $F$. Let $E = \{v_1, \ldots, v_p\}$ be a basis for $V$ over $F$ and set
$v = \begin{bmatrix} v_1 \\ \vdots \\ v_p \end{bmatrix} \in K^p$. Then there exists a nonzero matrix
$B \in \mathcal{M}_{n \times p}(F)$ satisfying $Bv = x$ and so
$ABv = Ax = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. But $E$ is linearly independent and so
we must have $AB = O$. Now take $y$ to be any nonzero column of $B$.


We now turn to iterative methods of solution of systems of linear equations. For 
simplicity, we will assume that our field of scalars is always R. The basic idea is, 
as we have already noted, to guess a possible solution and then use this initial guess 
to compute a sequence of further approximations to the solution which, hopefully, 
will converge (in some topology) with relative rapidity. Usually, the initial guess 
is based on knowledge of the real-life problem which gave rise to the system of 
equations, something that can often be done with good accuracy. In very large and 
computationally-difficult situations (for example, weather prediction, chip design, 
large-scale economic models, computational acoustics, or the modeling of the chemistry
of polymer chains), one can even use Monte Carlo methods, based on statistical
sampling and estimation techniques, to come up with an initial guess or even an ap- 
proximate solution. 

To illustrate this approach, let us consider the problem of solving a system of
linear equations of the form $AX = w$, where $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{R})$ is a
nonsingular matrix and $w = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \in \mathbb{R}^n$.
We know that this system has a unique solution, namely $A^{-1}w$, but inverting the matrix $A$
may be computationally time-consuming and prone to error, so we are looking for another method.
Suppose that we can write $A = E - D$, where $E$ is some matrix which is easy to invert. Then if
$v \in \mathbb{R}^n$ satisfies $Av = w$, we know that $Ev = Dv + w$ and so $v = E^{-1}(Dv + w)$.
We now guess a value for $v$, call it $v^{(0)}$. Then, using this formula, we can define
new vectors $v^{(1)}, v^{(2)}, \ldots$ iteratively by setting $v^{(h)} = E^{-1}(Dv^{(h-1)} + w)$ for each
$h > 0$. This can be done relatively quickly since, by assumption, $E^{-1}$ was relatively
easy to compute and, having computed it once for the first step of the iteration,
we don’t need to recompute it for subsequent steps. Our hope is that the sequence
$v^{(0)}, v^{(1)}, \ldots$ will in fact converge. Indeed, if this sequence does converge to
some vector $v$ then it is easy to verify that $v$ must be the unique solution of $AX = w$.

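For example, let us assume that the diagonal entries $a_{ii}$ of $A$ are all nonzero,
and let us choose $E$ to be the diagonal matrix having these entries on the diagonal.
Then $E^{-1}$ is also a diagonal matrix having the entries $a_{ii}^{-1}$ on the diagonal.
If our initial guess is $v^{(0)} = \begin{bmatrix} c_1^{(0)} \\ \vdots \\ c_n^{(0)} \end{bmatrix}$,
then it is easy to see that for $h > 0$ we have
$v^{(h)} = \begin{bmatrix} c_1^{(h)} \\ \vdots \\ c_n^{(h)} \end{bmatrix}$, where

$$c_i^{(h+1)} = a_{ii}^{-1}\Bigl[b_i - \sum_{j \ne i} a_{ij}c_j^{(h)}\Bigr]$$

for all $1 \le i \le n$.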

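This method is known as the Jacobi iteration method. Another possibility, again
under the assumption that the diagonal entries $a_{ii}$ of $A$ are all nonzero, is to choose
$E$ to be the lower-triangular matrix $[e_{ij}]$ defined by setting $e_{ij} = a_{ij}$ if $i \ge j$
and $e_{ij} = 0$ otherwise. Given an initial guess
$v^{(0)} = \begin{bmatrix} c_1^{(0)} \\ \vdots \\ c_n^{(0)} \end{bmatrix}$, we see that
$v^{(h)} = \begin{bmatrix} c_1^{(h)} \\ \vdots \\ c_n^{(h)} \end{bmatrix}$ for $h > 0$, where

$$c_i^{(h+1)} = a_{ii}^{-1}\Bigl[b_i - \sum_{j=1}^{i-1} a_{ij}c_j^{(h+1)} - \sum_{j=i+1}^{n} a_{ij}c_j^{(h)}\Bigr]$$

for all $1 \le i \le n$. (Choosing $E$ to be the upper-triangular part of $A$ instead gives the
same scheme with the entries updated in the reverse order.) This method
is known as the Gauss-Seidel iteration method, since it was discovered independently
by Gauss and by Jacobi’s student Philipp Ludwig von Seidel.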
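The two update rules can be sketched in a few lines of code. This is a minimal illustration in floating-point arithmetic, not a production solver; the function names and the test matrix `B` are assumptions made for the example, and the Gauss-Seidel step shown updates the entries from the first to the last.

```python
def jacobi_step(A, w, v):
    """One Jacobi step: every entry is computed from the previous vector only."""
    n = len(A)
    return [(w[i] - sum(A[i][j] * v[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)]

def gauss_seidel_step(A, w, v):
    """One Gauss-Seidel step: entries already updated in this sweep are reused."""
    n = len(A)
    new = list(v)
    for i in range(n):
        s = sum(A[i][j] * new[j] for j in range(n) if j != i)
        new[i] = (w[i] - s) / A[i][i]
    return new

def iterate(step, A, w, v0, count):
    """Return the list v^(0), v^(1), ..., v^(count)."""
    v = list(v0)
    history = [v]
    for _ in range(count):
        v = step(A, w, v)
        history.append(v)
    return history

# Illustrative strictly diagonally dominant system with solution (1, 1, 1).
B = [[4, 1, 1], [1, 5, 2], [1, 2, 6]]
wb = [6, 8, 9]
jacobi_history = iterate(jacobi_step, B, wb, [0, 0, 0], 60)
gauss_seidel_history = iterate(gauss_seidel_step, B, wb, [0, 0, 0], 60)
```

Note that neither function ever forms $E^{-1}$ explicitly; each step only divides by the diagonal entries, which is why these methods are cheap per iteration.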



Carl Gustav Jacob Jacobi was a nineteenth-century German math- 
ematician, who worked mostly in analysis and applied mathematics. 
His work in astronomy led him to solve large systems of linear equa- 
tions, and his papers on determinants helped make them well-known. 


In both of the above methods, and in other iteration methods (and there are many 
of these), there is no guarantee that the sequence of approximations will always 
converge or that, even if it does converge, it will do so rapidly. Understanding the 
conditions for convergence and analyzing the speed of convergence requires so- 
phisticated techniques in numerical analysis, and indeed there are many examples 
of matrices for which one iteration scheme converges whereas another doesn’t, as 
well as various necessary and sufficient conditions for a given iteration method to
converge. For example, a sufficient condition for the Jacobi iteration method to converge
for a matrix $A = [a_{ij}]$ is that $\sum_{j \ne i} |a_{ij}| < |a_{ii}|$ for all $1 \le i \le n$. It is also
known that if the matrix $A$ is tridiagonal, then the Jacobi method converges if and
only if the Gauss-Seidel method converges, but the latter always converges faster.


The convergence and accuracy of the Gauss-Seidel iteration method 
was studied in detail by the Russian mathematician and engineer 
Alexander Ivanovich Nekrasov at the beginning of the twentieth cen- 
tury, long before the use of electronic computers. 


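Example Let $A = \begin{bmatrix} 4 & 2 & 1 \\ -1 & 1 & 2 \\ 0 & 1 & 3 \end{bmatrix} \in \mathcal{M}_{3 \times 3}(\mathbb{R})$
and let $w = \begin{bmatrix} 7 \\ 2 \\ 4 \end{bmatrix}$. The system of linear equations $Ax = w$ has a
unique solution $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$. If we use the Jacobi iteration
method beginning with the initial guess $v^{(0)} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$, we get the
sequence of vectors (written to six-digit accuracy):

$$\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\begin{bmatrix} 1.75000 \\ 2.00000 \\ 1.33333 \end{bmatrix},
\begin{bmatrix} 0.41667 \\ 1.08333 \\ 0.66667 \end{bmatrix},
\begin{bmatrix} 1.04167 \\ 1.08333 \\ 0.97222 \end{bmatrix},
\begin{bmatrix} 0.96528 \\ 1.09722 \\ 0.97222 \end{bmatrix},
\begin{bmatrix} 0.95833 \\ 1.02083 \\ 0.96759 \end{bmatrix},$$

$$\begin{bmatrix} 0.99768 \\ 1.02314 \\ 0.99306 \end{bmatrix},
\begin{bmatrix} 0.99016 \\ 1.01157 \\ 0.99228 \end{bmatrix},
\begin{bmatrix} 0.99614 \\ 1.00559 \\ 0.99614 \end{bmatrix},
\begin{bmatrix} 0.99816 \\ 1.00386 \\ 0.99814 \end{bmatrix},
\begin{bmatrix} 0.99853 \\ 1.00190 \\ 0.99871 \end{bmatrix},$$

and if we use the Gauss-Seidel iteration method with the same initial guess, taking $E$ to be
the upper-triangular part of $A$ so that the entries of each new vector are computed from the
last to the first, we get the sequence of vectors (written to six-digit accuracy):

$$\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\begin{bmatrix} 1.75000 \\ -0.66667 \\ 1.33333 \end{bmatrix},
\begin{bmatrix} 1.04167 \\ 0.63889 \\ 1.55556 \end{bmatrix},
\begin{bmatrix} 1.06944 \\ 0.80093 \\ 1.12037 \end{bmatrix},
\begin{bmatrix} 1.01504 \\ 0.93672 \\ 1.06636 \end{bmatrix},
\begin{bmatrix} 1.00829 \\ 0.97287 \\ 1.02109 \end{bmatrix},$$

$$\begin{bmatrix} 1.00264 \\ 0.99020 \\ 1.00904 \end{bmatrix},
\begin{bmatrix} 1.00113 \\ 0.99611 \\ 1.00326 \end{bmatrix},
\begin{bmatrix} 1.00040 \\ 0.99853 \\ 1.00129 \end{bmatrix},
\begin{bmatrix} 1.00016 \\ 0.99943 \\ 1.00049 \end{bmatrix},
\begin{bmatrix} 1.00006 \\ 0.99978 \\ 1.00019 \end{bmatrix},$$

so we see that both methods converge, albeit quite differently.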


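Example Let $A = \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 2 \\ 0 & 2 & 1 \end{bmatrix} \in \mathcal{M}_{3 \times 3}(\mathbb{R})$
and let $w = \begin{bmatrix} 0 \\ -1 \\ 3 \end{bmatrix}$. The system of linear equations $Ax = w$ has a
unique solution $\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}$. If we try to solve this system
using the Gauss-Seidel method, computing the entries of each new vector from the first to the
last, with the initial guess $\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$, we get the sequence of vectors
$\begin{bmatrix} 0 \\ -1 \\ 5 \end{bmatrix}, \begin{bmatrix} 2 \\ -15 \\ 33 \end{bmatrix}, \begin{bmatrix} 30 \\ -127 \\ 257 \end{bmatrix}, \begin{bmatrix} 254 \\ -1023 \\ 2049 \end{bmatrix}, \ldots$
which clearly diverges.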


A more sophisticated iteration technique is, at each stage, not to replace $v^{(h)}$
by the computed $v^{(h+1)}$ but rather by a linear combination of the form
$rv^{(h+1)} + (1 - r)v^{(h)}$, where $r \in \mathbb{R}$ is a relaxation parameter. Doing this with
Jacobi iteration gives us the Jacobi overrelaxation (JOR) method, and doing it with the
Gauss-Seidel method gives us the successive overrelaxation (SOR) method. The relaxation
parameter $r$ is chosen on the basis of certain properties of the matrix $A$. By choosing
this parameter wisely, one can often achieve a considerable improvement in
convergence. For the JOR method, one normally chooses $0 < r \le 1$. In 1958, Kahan
showed that the SOR method does not converge for $r$ outside the open interval
$(0, 2)$.
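A hedged sketch of the two relaxed schemes follows. It assumes the usual componentwise form of SOR, in which the weighted average is taken entry by entry during the sweep; the function names, the parameter values, and the test system are illustrative assumptions only.

```python
def jor_step(A, w, v, r):
    """Jacobi overrelaxation: blend the Jacobi update with the old vector."""
    n = len(A)
    jacobi = [(w[i] - sum(A[i][j] * v[j] for j in range(n) if j != i)) / A[i][i]
              for i in range(n)]
    return [r * jacobi[i] + (1 - r) * v[i] for i in range(n)]

def sor_step(A, w, v, r):
    """Successive overrelaxation: blend each Gauss-Seidel entry as it is computed."""
    n = len(A)
    new = list(v)
    for i in range(n):
        s = sum(A[i][j] * new[j] for j in range(n) if j != i)
        gauss_seidel_value = (w[i] - s) / A[i][i]
        new[i] = r * gauss_seidel_value + (1 - r) * new[i]
    return new

def run(step, A, w, v0, r, count):
    """Apply the given relaxed step `count` times and return the last vector."""
    v = list(v0)
    for _ in range(count):
        v = step(A, w, v, r)
    return v

# Illustrative symmetric, strictly diagonally dominant system with solution (1, 1, 1).
B = [[4, 1, 1], [1, 5, 2], [1, 2, 6]]
wb = [6, 8, 9]
jor_result = run(jor_step, B, wb, [0, 0, 0], 0.9, 300)
sor_result = run(sor_step, B, wb, [0, 0, 0], 1.1, 200)
```

With $r = 1$ the two functions reduce to the plain Jacobi and Gauss-Seidel steps, which is a convenient sanity check.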


Relaxation methods were first developed by the twentieth-century British mathematician
Richard V. Southwell. Contemporary Canadian mathematician William Kahan has made major
contributions to numerical analysis and matrix computation. The optimal relaxation parameters
for the SOR method were calculated by the twentieth-century American mathematician
David M. Young, Jr.


As a rule of thumb, iteration methods work best for large sparse matrices, such
as those arising from the solution of systems of partial differential equations. As
previously remarked, in iteration methods truncation and roundoff errors tend to
cancel each other out, rather than accumulate. While sparse matrices arise in many
applications (such as circuit simulation, analyses of chemical processes, and
magnetic-field computation), there are also important situations, such as the matrices arising
in radial-basis function interpolation, a technique of great importance in computer
graphics, which lead to very large matrices almost all entries of which are nonzero.

The Jacobi, Gauss-Seidel, JOR, and SOR methods are examples of iteration
methods of the form $v^{(h+1)} = \alpha(v^{(h)})$, where $\alpha$ is an affine transformation of $\mathbb{R}^n$
that does not depend on $h$. Such methods are known as stationary iteration methods.
In a later chapter, we shall also mention some iteration methods which are not
stationary.


Example In the beginning of this chapter, we saw an example of how a nonlinear
system of equations can be turned into a linear system. This can often be done
in more general cases, producing large systems of linear equations of the form
$AX = w$, where the matrix $A$ is usually sparse and for which iteration methods are
therefore appropriate. Consider, for example, the problem of finding real numbers
$a$, $b$, and $c$ such that the following conditions hold:

$$\begin{aligned}
a^2 - b^2 + c^2 &= 6,\\
ab + ac + 4bc &= 29,\\
a^2 + 2ab - 2bc &= -7,\\
2a^2 - 3ab + c^2 &= 5,\\
b^2 - c^2 + 5ab &= 5,\\
2ac - 3b^2 &= -6.
\end{aligned}$$

To linearize this, we begin by assigning variables to all of the terms appearing in the
equations: $X_1 = a^2$, $X_2 = b^2$, $X_3 = c^2$, $X_4 = ab$, $X_5 = ac$, and $X_6 = bc$. This then
yields the system of linear equations

$$\begin{bmatrix}
1 & -1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 4 \\
1 & 0 & 0 & 2 & 0 & -2 \\
2 & 0 & 1 & -3 & 0 & 0 \\
0 & 1 & -1 & 5 & 0 & 0 \\
0 & -3 & 0 & 0 & 2 & 0
\end{bmatrix}
\begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \\ X_5 \\ X_6 \end{bmatrix}
=
\begin{bmatrix} 6 \\ 29 \\ -7 \\ 5 \\ 5 \\ -6 \end{bmatrix}$$

which has a unique solution $X_1 = 1$, $X_2 = 4$, $X_3 = 9$, $X_4 = 2$, $X_5 = 3$, and $X_6 = 6$,
from which we deduce that $a = 1$, $b = 2$, and $c = 3$ (the choice $a = -1$, $b = -2$, $c = -3$
satisfies the same conditions).
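The linearized system is small enough to solve exactly by machine. The sketch below assumes the unknowns are ordered as $a^2, b^2, c^2, ab, ac, bc$ and reads the coefficients off the six equations above; the helper `solve` is a hypothetical Gauss-Jordan routine in exact rational arithmetic, written only for this illustration.

```python
from fractions import Fraction

def solve(A, w):
    """Solve a nonsingular square system A x = w by Gauss-Jordan elimination."""
    n = len(A)
    m = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(A, w)]
    for col in range(n):
        # First row at or below the diagonal with a nonzero entry in this column
        # (raises StopIteration if the matrix is singular).
        p = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[p] = m[p], m[col]
        lead = m[col][col]
        m[col] = [x / lead for x in m[col]]
        for i in range(n):
            if i != col and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[col])]
    return [row[n] for row in m]

# Columns correspond to X1 = a^2, X2 = b^2, X3 = c^2, X4 = ab, X5 = ac, X6 = bc.
A = [
    [1, -1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 4],
    [1, 0, 0, 2, 0, -2],
    [2, 0, 1, -3, 0, 0],
    [0, 1, -1, 5, 0, 0],
    [0, -3, 0, 0, 2, 0],
]
w = [6, 29, -7, 5, 5, -6]
X = solve(A, w)
```

Recovering $a$, $b$, $c$ from $X$ still requires checking that the values are mutually consistent, for instance that $X_4^2 = X_1 X_2$, since the linear system knows nothing about the relations among its unknowns.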


The iterative methods we have discussed so far are all linear, in the sense that 
they involve only methods of linear algebra. There are, however, also families of 
nonlinear iterative methods, involving the calculus of functions of several variables,
of which one should be aware. These include gradient (steepest-descent) methods
and conjugate-direction methods. A discussion of these methods is beyond the scope 
of this book. 

Finally, another important warning. When we attempt to solve systems of linear 
equations on a computer, it is important to remember that the system may be very 
sensitive, and small changes in the entries of the coefficient matrix may lead to 
large changes in the solution. Such systems are said to be ill-conditioned. Applied 
mathematicians and others who design mathematical models often take considerable 
pains to avoid creating ill-conditioned systems. 


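Example Let $A = \begin{bmatrix} 7 & 7 & 8 & 10 \\ 5 & 5 & 6 & 7 \\ 6 & 9 & 10 & 8 \\ 5 & 10 & 9 & 7 \end{bmatrix} \in \mathcal{M}_{4 \times 4}(\mathbb{R})$.
This matrix is nonsingular, with inverse equal to
$\begin{bmatrix} -41 & 68 & -17 & 10 \\ -6 & 10 & -3 & 2 \\ 10 & -17 & 5 & -3 \\ 25 & -41 & 10 & -6 \end{bmatrix}$.
Let $w = \begin{bmatrix} 32 \\ 23 \\ 33 \\ 31 \end{bmatrix}$. Then the system of equations $AX = w$ has a
unique solution $\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$. However, we also note that
$A \begin{bmatrix} -7.2 \\ -0.1 \\ 2.9 \\ 6.0 \end{bmatrix} = \begin{bmatrix} 32.10 \\ 22.90 \\ 32.90 \\ 31.10 \end{bmatrix}$.
Thus a change of 0.1 in each entry of $w$ produces a very different solution.
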
Example Consider the system of linear equations

$$\begin{bmatrix}
1 & -10 & 0 & 0 & 0 & 0 \\
0 & 1 & -10 & 0 & 0 & 0 \\
0 & 0 & 1 & -10 & 0 & 0 \\
0 & 0 & 0 & 1 & -10 & 0 \\
0 & 0 & 0 & 0 & 1 & -10 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix} X =
\begin{bmatrix} -9 \\ -9 \\ -9 \\ -9 \\ -9 \\ 1 \end{bmatrix}$$

over $\mathbb{R}$. This system has a unique solution, namely
$\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$. However, if we alter the
coefficient matrix by changing the $(6, 6)$-entry to $\frac{1000}{1001}$ (which is roughly equal to
0.9990009), we will obtain a completely different solution, namely
$\begin{bmatrix} 101 \\ 11 \\ 2 \\ 1.1 \\ 1.01 \\ 1.001 \end{bmatrix}$.
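The sensitivity in this example can be reproduced by back substitution, since the coefficient matrix is upper triangular with only two nonzero diagonals. The sketch below is an illustration under the assumption that the altered $(6,6)$-entry is exactly $1000/1001$ (the decimal 0.9990009 is consistent with this value); the function name is hypothetical.

```python
from fractions import Fraction

def solve_upper_bidiagonal(diagonal, superdiagonal, rhs):
    """Back substitution for a matrix with nonzero entries only on the
    main diagonal and the diagonal just above it."""
    n = len(diagonal)
    x = [Fraction(0)] * n
    x[n - 1] = Fraction(rhs[n - 1]) / diagonal[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (Fraction(rhs[i]) - superdiagonal[i] * x[i + 1]) / diagonal[i]
    return x

rhs = [-9, -9, -9, -9, -9, 1]
exact = solve_upper_bidiagonal([1] * 6, [-10] * 5, rhs)
# Same system with the last diagonal entry changed by roughly one part in a thousand.
perturbed = solve_upper_bidiagonal([1] * 5 + [Fraction(1000, 1001)], [-10] * 5, rhs)
```

Each step of the back substitution multiplies the discrepancy in the last unknown by 10, which is why a change in the fourth decimal place of one coefficient moves the first unknown from 1 to 101.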


Since real-life computations are based, as a rule, on numbers gathered through 
some sort of a measurement process, which is, as a matter of fact, not completely 
accurate and certainly beyond our control, it is extremely important to know how 
sensitive the system is to possible small variations in the values of the entries. The 
numerical analysis of matrices deals extensively with this issue, and here we can 
only present a simplistic measure of this sensitivity for nonsingular square matrices 
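over $\mathbb{R}$. To any matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{R})$, we will assign the
number $\theta(A)$ defined by $\theta(A) = \max_{1 \le i \le n} \bigl\{ \sum_{j=1}^{n} |a_{ij}| \bigr\}$.
The number $\theta(A)\theta(A^{-1})$ is the condition number of the matrix $A$. Note that $A$ has
the same condition number as $A^{-1}$ and as $cA$, for any $0 \ne c \in \mathbb{R}$.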




Condition numbers were introduced by John von Neumann, one of 
the great mathematical geniuses of the twentieth century, who con- 
tributed to practically all branches of mathematics—pure and applied. 
Von Neumann was a major force in the introduction of digital comput- 
ers after World War II and the development of numerical methods for 
them. 


The condition number can be written in the form $g \times 10^t$, where $0.1 \le g < 1$. If
$t > 0$ then, as a rule of thumb, one can expect that the solution of a system of linear
equations $AX = w$ will have $t$ significant digits fewer than that of the entries of $A$.
Thus, if $A$ is the matrix in the previous example, then $\theta(A) = 11$. Moreover,

$$A^{-1} = \begin{bmatrix}
1 & 10 & 100 & 1000 & 10000 & 100000 \\
0 & 1 & 10 & 100 & 1000 & 10000 \\
0 & 0 & 1 & 10 & 100 & 1000 \\
0 & 0 & 0 & 1 & 10 & 100 \\
0 & 0 & 0 & 0 & 1 & 10 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}$$

and so $\theta(A^{-1}) = 111{,}111$. Therefore $\theta(A)\theta(A^{-1})$ is roughly $0.12 \times 10^7$,
and so we cannot, as we have seen, expect any accuracy in our solution, if we assume our data
is only good to 6-digit accuracy.
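The computation of $\theta(A)\theta(A^{-1})$ can be sketched directly from the definition. The code below is an illustration only: it assumes $\theta$ is the largest row sum of absolute values, as defined above, and it inverts the matrix exactly by Gauss-Jordan elimination, which is practical only for small matrices. The function names are hypothetical.

```python
from fractions import Fraction

def inverse(A):
    """Exact inverse of a nonsingular square matrix by Gauss-Jordan elimination."""
    n = len(A)
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        p = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[p] = m[p], m[col]
        lead = m[col][col]
        m[col] = [x / lead for x in m[col]]
        for i in range(n):
            if i != col and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[col])]
    return [row[n:] for row in m]

def theta(A):
    """Largest sum of absolute values of the entries in a row."""
    return max(sum(abs(x) for x in row) for row in A)

def condition_number(A):
    return theta(A) * theta(inverse(A))

# The 6 x 6 matrix with 1 on the diagonal and -10 just above it.
A = [[1 if j == i else -10 if j == i + 1 else 0 for j in range(6)] for i in range(6)]
kappa = condition_number(A)   # equals 11 * 111111
```

In practice one would use a condition number estimator rather than an explicit inverse, for the reasons discussed in the text.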

Similarly, Nievergelt’s matrix $\begin{bmatrix} 888445 & 887112 \\ 887112 & 885871 \end{bmatrix}$,
which we have already encountered, has condition number roughly equal to $0.39 \times 10^5$.

Of course, computing the condition number of a given matrix may also be a problem,
since it involves calculating $A^{-1}$. Fortunately, there are many fairly efficient
condition number estimators, algorithms that give a good estimate of the condition
number of a matrix with relatively low computational overhead.

Various techniques, going under the collective name of preconditioning tech- 
niques, are also often used to increase the speed of convergence and accuracy of 
various iterative methods. A discussion of these can be found in any advanced book 
on numerical matrix computation. 


Exercises 

Exercise 579 
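Are the matrices $\begin{bmatrix} 3 & 4 & 1 \\ -2 & -4 & -6 \\ 5 & 2 & 7 \end{bmatrix}$ and
$\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$ in $\mathcal{M}_{3 \times 3}(\mathbb{R})$
row equivalent?

Exercise 580
Bring the matrix $\begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 4 & 3 \\ 2 & 3 & 1 & 4 \end{bmatrix} \in \mathcal{M}_{3 \times 4}(\mathbb{R})$
to reduced row echelon form.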
23 1 4 


Exercise 581 


Let F = GF(5). Bring the matrix € M3x4(F) to reduced row 


eS NOR 
NwWh 
hee 
oro 


echelon form. 


Exercise 582 
Solve the system of linear equations 


$$\begin{aligned}
(3 - i)X_1 + (2 - i)X_2 + (4 + 2i)X_3 &= 2 + 6i,\\
(4 + 3i)X_1 - (5 + i)X_2 + (1 + i)X_3 &= 2 + 2i,\\
(2 - 3i)X_1 + (1 - i)X_2 + (2 + 4i)X_3 &= 5i
\end{aligned}$$
over $\mathbb{C}$.


Exercise 583 
Solve the system of linear equations 


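$$\begin{aligned}
X_1 + 2X_2 + 4X_3 &= 31,\\
5X_1 + X_2 + 2X_3 &= 29,\\
3X_1 - X_2 + X_3 &= 10
\end{aligned}$$
over $\mathbb{R}$.


Exercise 584
Solve the system of linear equations
$$\begin{aligned}
3X_1 + 4X_2 + 10X_3 &= 1,\\
2X_1 + 2X_2 + 2X_3 &= 0,\\
X_1 + X_2 + 5X_3 &= 1
\end{aligned}$$
over $GF(11)$.


Exercise 585
Find all solutions to the system
$$\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 2 & 3 \\ 3 & 2 & 1 & 2 \\ 4 & 3 & 2 & 1 \end{bmatrix}
\begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{bmatrix} =
\begin{bmatrix} 5 \\ 1 \\ 1 \\ -5 \end{bmatrix}$$
over $\mathbb{R}$.


Exercise 586
Find all solutions to the system
$$\begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 2 & 3 & 4 \\ 2 & 2 & 1 & 2 & 3 \\ 2 & 2 & 2 & 1 & 2 \\ 2 & 2 & 2 & 2 & 1 \end{bmatrix}
\begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \\ X_5 \end{bmatrix} =
\begin{bmatrix} 13 \\ 10 \\ 11 \\ 6 \\ 3 \end{bmatrix}$$
over $\mathbb{R}$.


Exercise 587
Find all solutions to the system
$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} =
\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$
over $GF(2)$.


Exercise 588
Find all solutions to the system
$$\begin{bmatrix} 1 & 3 & 2 \\ 2 & -1 & 3 \\ 3 & -5 & 4 \\ 1 & 17 & 4 \end{bmatrix}
\begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix} =
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$
over $\mathbb{R}$.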
Exercise 589
Find a real number $a$ so that the system of linear equations
$$\begin{aligned}
2X_1 - X_2 + X_3 + X_4 &= 1,\\
X_1 + 2X_2 - X_3 + 4X_4 &= 2,\\
X_1 + 7X_2 - 4X_3 + 11X_4 &= a
\end{aligned}$$
has a solution over $\mathbb{R}$.


Exercise 590
Find all real numbers $c$ such that the system of equations
$$\begin{aligned}
X_1 + X_2 - X_3 &= 1,\\
X_1 + cX_2 + 3X_3 &= 2,\\
2X_1 + 3X_2 + cX_3 &= 3
\end{aligned}$$
has a unique solution over $\mathbb{R}$; find those real numbers $c$ for which it has infinitely
many solutions over $\mathbb{R}$; find those real numbers $c$ for which it has no solution
over $\mathbb{R}$.


Exercise 591
Solve the system of linear equations
$$\begin{aligned}
X_1 + 2X_2 + X_3 &= 1,\\
X_1 + X_2 + X_3 &= 0
\end{aligned}$$
over $GF(3)$.


Exercise 592
Solve the system of linear equations
$$\begin{aligned}
X_1 + (\sqrt{2})X_2 + (\sqrt{2})X_3 &= 3,\\
X_1 + (1 + \sqrt{2})X_2 + X_3 &= 3 + \sqrt{2},\\
X_1 + X_2 - (\sqrt{2})X_3 &= 4 + \sqrt{2}
\end{aligned}$$
over $\mathbb{Q}(\sqrt{2})$.


Exercise 593 
Solve the system of linear equations 


$$\begin{aligned}
4X_1 - 3X_2 &= 3,\\
2X_1 - X_2 + 2X_3 &= 1,\\
3X_1 + 2X_3 &= 4
\end{aligned}$$
over $GF(5)$.


Exercise 594 
Solve the system of linear equations 


$$\begin{aligned}
4X_1 + 6X_2 + 2X_3 &= 8,\\
X_1 - aX_2 - 2X_3 &= -5,\\
7X_1 + 3X_2 + (a - 5)X_3 &= 7
\end{aligned}$$
over $\mathbb{R}$, for various values of the real number $a$.


Exercise 595
For a given real number $a$, solve the system
$$\begin{aligned}
aX_1 + X_2 + X_3 &= 1,\\
X_1 + aX_2 + X_3 &= 1,\\
X_1 + X_2 + aX_3 &= 1
\end{aligned}$$
over $\mathbb{R}$.


Exercise 596
For a given $a \in \mathbb{R}$, does the system of linear equations
$$\begin{aligned}
aX_1 + X_2 + 2X_3 &= 0,\\
X_1 - X_2 + aX_3 &= 1,\\
X_1 + X_2 + X_3 &= 1
\end{aligned}$$
have a unique solution in $\mathbb{R}$?


Exercise 597
Let $a$ be an element of a field $F$. Find the set of all solutions to the system of
linear equations
$$\begin{aligned}
X_1 + X_2 + aX_3 &= a,\\
X_1 + aX_2 - X_3 &= 1,\\
X_1 + X_2 - X_3 &= 1
\end{aligned}$$
over $F$.


Exercise 598
For which $a \in \mathbb{Q}$ does the system of linear equations
$$\begin{aligned}
X_1 + 3X_2 - 2X_3 &= 2,\\
3X_1 + 9X_2 - 2X_3 &= 2,\\
2X_1 + 6X_2 + X_3 &= a
\end{aligned}$$
have a unique solution in $\mathbb{Q}$?


Exercise 599 
Find real numbers $a$, $b$, $c$, and $d$ such that the points $(1, 2)$, $(-1, 6)$, $(-2, 38)$,
and $(2, 6)$ all lie on the curve $y = ax^4 + bx^3 + cx^2 + d$ in the Euclidean plane.


Exercise 600 
Find a polynomial $p(X) = a_2X^2 + a_1X + a_0 \in \mathbb{R}[X]$ satisfying $p(1) = -1$,
$p(-1) = 9$, and $p(2) = -3$.


Exercise 601
Find a polynomial $p(X) = a_3X^3 + a_2X^2 + a_1X + a_0 \in \mathbb{R}[X]$ satisfying $p(0) = 2$,
$p(2) = 6$, $p(4) = 3$, and $p(6) = -5$.


Exercise 602 
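Let $F = GF(13)$. Find a homogeneous system of linear equations over $F$ satisfying
the condition that its solution space equals
$$F\left\{ \begin{bmatrix} 2 \\ 1 \\ 9 \\ 7 \\ 4 \end{bmatrix}, \begin{bmatrix} 8 \\ 3 \\ 10 \\ 5 \\ 12 \end{bmatrix}, \begin{bmatrix} 7 \\ 6 \\ 2 \\ 11 \\ 7 \end{bmatrix} \right\}.$$


Exercise 603
Let $F$ be a field. Let $b \in F$ and let
$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \in \mathcal{M}_{2 \times 3}(F)$
be a matrix satisfying the condition that the sum of the entries in each row and each
column of $A$ equals $b$. Show that $b = 0$.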


Exercise 604 
Let p(X) = X° — 7X? + 12 € Q(X]. Find a polynomial g(X) € Q[X] of degree 
at most 3 satisfying p(a) = q(a) for all a € {0, 1, 2, 3}. 
Exercise 605
Find the rank of the matrix 


1 -1 2 3 4 
1 -l 2 0 
-1 2 1 1 3 | €Msx5(R). 
1 5 -8 -—5 —-12 
3-7 8 9 13 
Exercise 606 
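Find the rank of the matrix
$$\begin{bmatrix} 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 1 & 0 & 1 & 1 \end{bmatrix} \in \mathcal{M}_{5 \times 5}(\mathbb{R}).$$

Exercise 607
Let $F = GF(2)$. Find the rank of the matrix
$\begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{bmatrix} \in \mathcal{M}_{3 \times 3}(F)$.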
Exercise 608 
Let F = GF(5). Find the rank of the matrix 
1 23 4 a 
4 3 a 12 
2 3a 2 4a 1 
for various values of a € F. 
Exercise 609 
Do there exist a lower-triangular matrix $L$ and an upper-triangular matrix $U$ in
$\mathcal{M}_{3 \times 3}(\mathbb{Q})$ satisfying the condition
$LU = \begin{bmatrix} 1 & -1 & 2 \\ 2 & -1 & 3 \\ 0 & 1 & 8 \end{bmatrix}$?


Exercise 610
Let $F = GF(5)$. For which values of $a \in F$ do there exist a lower-triangular
matrix $L$ and an upper-triangular matrix $U$ in $\mathcal{M}_{3 \times 3}(F)$ satisfying the condition
that $LU = \begin{bmatrix} 1 & 1 & a \\ 4 & 1 & 0 \\ a & 1 & 4 \end{bmatrix}$?


Exercise 611
Find the LU-decomposition of
$\begin{bmatrix} 4 & 2 & 3 & 4 \\ 2 & 0 & 2 & 2 \\ 3 & 4 & -4 & 5 \\ -1 & 0 & 2 & 3 \end{bmatrix} \in \mathcal{M}_{4 \times 4}(\mathbb{R})$.


Exercise 612
Let $A \in \mathcal{M}_{n \times n}(\mathbb{R})$ be a tridiagonal matrix all diagonal entries of which are
nonzero. Can we write $A = LU$, where $L$ is a lower-triangular matrix and $U$
is an upper-triangular matrix, both of which are also tridiagonal?


Exercise 613
Let $F$ be a field and let $a, b, c \in F$. Find the rank of the matrix
$$\begin{bmatrix} 1 & 1 & 1 \\ b + c & c + a & a + b \\ bc & ca & ab \end{bmatrix} \in \mathcal{M}_{3 \times 3}(F).$$


Exercise 614
Let $F$ be a field and let $a, b, c, d \in F$. Find the rank of the matrix
$$\begin{bmatrix} a & c & c \\ d & a + b & c \\ d & d & b \end{bmatrix} \in \mathcal{M}_{3 \times 3}(F).$$
Exercise 615 
3 1 4 
Find the rank of the matrix : 7 ; € M4,.4(R) for various values of 
2 2 3 


the real number a. 


Exercise 616 


Find the rank of $\begin{bmatrix} a & -1 & 2 & 1 \\ -1 & a & 5 & 2 \\ 10 & -6 & 1 & 1 \end{bmatrix} \in \mathcal{M}_{3 \times 4}(\mathbb{Q})$
for various values of the rational number $a$.


Exercise 617 
Find the set of all real numbers a such that the rank of the matrix 


a 1 1 
1 —1 | €M3,3(R) 
1 a 


equals 2. 
Exercise 618

Let $n$ be a positive integer and, for each $A \in \mathcal{M}_{n \times n}(\mathbb{R})$, let $r(A)$ be the rank of $A$.
Define a relation $\le$ on $\mathcal{M}_{n \times n}(\mathbb{R})$ by setting $B \le A$ if and only if
$r(A - B) = r(A) - r(B)$. Is this a partial order relation?


Exercise 619

Let $F$ be a subfield of a field $K$. Let $k$ and $n$ be positive integers and let
$A \in \mathcal{M}_{k \times n}(F)$ be a matrix having rank $r$. If we now think of $A$ as an element of
$\mathcal{M}_{k \times n}(K)$, is its rank necessarily still equal to $r$?


Exercise 620 
17 17 3 
: 4 4 8 6 fo hed 
Find k € Z such that the rank of 31°14 € Ma,.4(Q) is minimal. 
2k 8 20 2 


Exercise 621 

Let F be a field and let k and n be positive integers. For a matrix A € Mxyn(F) 
having rank h, show that there exist matrices B€ Myxn(F) and C € Mixn(F) 
such that A = BC. 


Exercise 622 

Let $k$ and $n$ be positive integers and let $F$ be a field. For matrices $A, B \in M_{k\times n}(F)$,
show that the rank of $A + B$ is no more than the sum of the ranks
of $A$ and of $B$.


Exercise 623 

Let $k$ and $n$ be positive integers and let $F$ be a field. Let $A, B \in M_{k\times n}(F)$ be
matrices satisfying the condition that the row space of $A$ and the row space of $B$
are disjoint. Does it follow from this that the rank of $A + B$ equals the sum of the
rank of $A$ and the rank of $B$?


Exercise 624 
Find bases for the row space and column space of the matrix 


$$\begin{bmatrix} 1 & 2 & -3 & -1 & 2 \\ -1 & -2 & 1 & 1 & 0 \\ 1 & 2 & 0 & 2 & 1 \end{bmatrix} \in M_{3\times 5}(\mathbb{R}).$$


Exercise 625 
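Find matrices $P, Q \in M_{3\times 3}(\mathbb{R})$ satisfying
$$P \begin{bmatrix} 1 & 2 & 3 \\ 2 & -2 & 1 \\ 3 & 0 & 4 \end{bmatrix} Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$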
Exercise 626
1 2 0 
Write the rows of the matrix A=}i-—1 2 i | €M3,3(C©) as linear com- 
0) 2 -i 
binations of the rows of A’. 
Exercise 627 
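Calculate
$$\begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 0 & 1 & 2 & 3 & 4 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}^{-1} \in M_{5\times 5}(\mathbb{R}).$$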
Exercise 628 


Let $k$ and $n$ be positive integers and let $F$ be a field. Let $A = \begin{bmatrix} B & C \\ D & E \end{bmatrix}$ be a matrix
in $M_{k\times n}(F)$, where $B$ is a nonsingular matrix in $M_{r\times r}(F)$ for some $1 \le r < \min\{k, n\}$.
Show that the rank of $A$ equals $r$ if and only if $DB^{-1}C = E$.


Exercise 629 
Let $F$ be a field and let $a, b, c$ be distinct elements of $F$. Furthermore, let $d, e, f$
be distinct elements of $F$. What is the rank of the matrix
$$\begin{bmatrix} 1 & a & d & ad \\ 1 & b & e & be \\ 1 & c & f & cf \end{bmatrix} \in M_{3\times 4}(F)?$$


Exercise 630 

Let k and n be positive integers and let F be a field. Let A € Mxxn(F) and let 
w € F* be such that the system of linear equations AX = w has a nonempty set 
of solutions and that all of these solutions satisfy the condition that the th entry 
in them is some fixed scalar c. What can we deduce about the columns of the 
matrix A? 


Exercise 631 

Let $n$ be a positive integer and let $F$ be a field. Let $O \ne A \in M_{n\times n}(F)$. Show
that there exists a nonnegative integer $k$ such that the rank of $A^h$ equals the rank
of $A^k$ for all $h \ge k$.


Exercise 632 


LetA=]—-1 —1 1 | €M3,3(R). Find the condition number of A. 
1 
Exercise 633

Let $a$ be a positive real number. Is it necessarily true that the condition number
of $A = \begin{bmatrix} a & 1 & a \\ 0 & 0 & -1 \\ a & -1 & a \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$ is greater than $2a + 1$?


Exercise 634 
Find a positive real number a for which the condition number of 


€ M33 (R) 


> 

ll 
- Ore 
Se Qe 
= 2 © 


is maximal. 


Exercise 635 
Does there exist a system AX = w of linear equations in n unknowns (for some 
positive integer n) over R having precisely 35 distinct solutions? 


Exercise 636 
Can one find an integer / such that the condition number of the matrix 


1 —1000 1 
1 -—100 0 
1 h 1 


is greater than 10°? 


11 Determinants


Let $F$ be a field and let $n$ be a positive integer. We would like to find a function
from $M_{n\times n}(F)$ to $F$ which will serve as an oracle of singularity, namely a function
that will assign a value of 0 to singular matrices and a value other than 0 to nonsingular
matrices. A function $\delta_n : M_{n\times n}(F) \to F$ is a determinant function if and only if
it satisfies the following conditions:
(1) $\delta_n(I) = 1$;
(2) $\delta_n(A) = 0$ if $A$ is a matrix having a row all of the entries of which are 0;
(3) $\delta_n(E_{ij}A) = -\delta_n(A)$ for all $1 \le i \ne j \le n$;
(4) $\delta_n(E_{ij;c}A) = \delta_n(A)$ for all $1 \le i \ne j \le n$ and all $c \in F$;
(5) $\delta_n(E_{i;c}A) = c\,\delta_n(A)$ for all $1 \le i \le n$ and all $0 \ne c \in F$.

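In particular, we note that for each $1 \le i \ne j \le n$ and all $c \in F$ we have $\delta_n(E_{ij}) = -1 = \delta_n(E_{ij}^T)$
and $\delta_n(E_{ij;c}) = 1 = \delta_n(E_{ij;c}^T)$, while $\delta_n(E_{i;c}) = c = \delta_n(E_{i;c}^T)$ whenever $c \ne 0$.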

We have yet to show that such functions exist for all values of n, but certainly 
they exist for a few small ones. 


Example For $n = 1$, the function $\delta_1 : [a] \mapsto a$ is a determinant function. For $n = 2$,
the function
$$\delta_2 : \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \mapsto a_{11}a_{22} - a_{12}a_{21}$$
is a determinant function.
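The five conditions can be spot-checked numerically for the $2 \times 2$ function of the example. The following is only a sketch: the helper names (`delta2`, `swap_rows`, `add_multiple`, `scale_row`) and the sample matrix are illustrative assumptions, and checking one matrix over the rationals is evidence, not a proof.

```python
from fractions import Fraction

def delta2(m):
    """The 2x2 function a11*a22 - a12*a21 from the example."""
    (a11, a12), (a21, a22) = m
    return a11 * a22 - a12 * a21

def swap_rows(m):
    """Effect of multiplying on the left by the interchange matrix."""
    return [m[1], m[0]]

def add_multiple(m, src, dst, c):
    """Add c times row `src` to row `dst`."""
    out = [list(row) for row in m]
    out[dst] = [x + c * y for x, y in zip(m[dst], m[src])]
    return out

def scale_row(m, i, c):
    """Multiply row i by the nonzero scalar c."""
    out = [list(row) for row in m]
    out[i] = [c * x for x in m[i]]
    return out

A = [[Fraction(3), Fraction(-2)], [Fraction(5), Fraction(7)]]
c = Fraction(4, 3)

checks = {
    "(1) identity": delta2([[1, 0], [0, 1]]) == 1,
    "(2) zero row": delta2([[0, 0], A[1]]) == 0,
    "(3) row interchange": delta2(swap_rows(A)) == -delta2(A),
    "(4) row addition": delta2(add_multiple(A, 0, 1, c)) == delta2(A),
    "(5) row scaling": delta2(scale_row(A, 0, c)) == c * delta2(A),
}
print(checks)
```

Exact rational arithmetic is used so that the comparisons are not disturbed by rounding.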


As an immediate consequence of parts (1) and (5) of the definition, we see that
if $A = [a_{ij}] \in M_{n\times n}(F)$ is a diagonal matrix and if $\delta_n : M_{n\times n}(F) \to F$ is a determinant
function, then $\delta_n(A) = \left(\prod_{i=1}^n a_{ii}\right)\delta_n(I) = \prod_{i=1}^n a_{ii}$.

We now want to show that for each positive integer $n$ there exists a determinant
function $\delta_n : M_{n\times n}(F) \to F$, and indeed that this function is unique. We will
first establish the uniqueness of these functions and check some of their properties,
holding off on existence until later in this chapter.


Proposition 11.1 Let $F$ be a field. For each positive integer $n$ there exists at
most one determinant function $\delta_n : M_{n\times n}(F) \to F$.
Proof Let us assume that $\delta_n : M_{n\times n}(F) \to F$ and $\eta_n : M_{n\times n}(F) \to F$ are determinant
functions and let $\beta = \eta_n - \delta_n$. Then the function $\beta$ satisfies the following
conditions:

(1) $\beta(I) = 0$;

(2) $\beta(A) = 0$ if $A$ is a matrix having a row all of the entries of which are 0;

(3) $\beta(E_{ij}A) = -\beta(A)$ for all $1 \le i \ne j \le n$;

(4) $\beta(E_{ij;c}A) = \beta(A)$ for all $1 \le i \ne j \le n$ and all $c \in F$;

(5) $\beta(E_{i;c}A) = c\,\beta(A)$ for all $1 \le i \le n$ and all $0 \ne c \in F$.

In particular, if $A \in M_{n\times n}(F)$ and $E$ is an elementary matrix, then $\beta(A)$ and $\beta(EA)$
are either both equal to 0 or both of them are different from 0. But for any matrix
$A$ we know that there exist elementary matrices $E_1, \ldots, E_t$ in $M_{n\times n}(F)$ such that
either $E_1 \cdots E_t A = I$ or $E_1 \cdots E_t A$ is a matrix having at least one row all of the
entries of which equal 0. Therefore, $\beta(A) = 0$ for every $A \in M_{n\times n}(F)$. Thus $\beta$ is
the zero-function, and so $\delta_n = \eta_n$.


Proposition 11.2 Let $F$ be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. Then $\delta_n(A) \ne 0$ if and only if $A$ is nonsingular.


Proof If $A$ is nonsingular, there exist elementary matrices $E_1, \ldots, E_t$ in $M_{n\times n}(F)$
such that $E_1 \cdots E_t A = I$, and so, by the definition of the determinant function,
$\delta_n(A) = c\,\delta_n(I) = c$, where $0 \ne c \in F$, and so $\delta_n(A) \ne 0$. Now assume that
$\delta_n(A) \ne 0$ and that $A$ is singular. Then there exist elementary matrices $E_1, \ldots, E_t$
in $M_{n\times n}(F)$ such that $E_1 \cdots E_t A$ is a matrix having at least one row all of the
entries of which equal 0. But then, for some $0 \ne c \in F$, we have
$0 \ne \delta_n(A) = c\,\delta_n(E_1 \cdots E_t A) = c0 = 0$, which is a contradiction, proving that $A$ must be
nonsingular.


Thus we see that the determinant function, to the extent it exists, is the oracle we 
are seeking. 


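Example Let $a, b, c, d$ be real numbers. The subset
$$\left\{ \begin{bmatrix} a+bi \\ c+di \end{bmatrix}, \begin{bmatrix} -c+di \\ a-bi \end{bmatrix} \right\}$$
of $\mathbb{C}^2$ is linearly dependent if and only if
$$A = \begin{bmatrix} a+bi & -c+di \\ c+di & a-bi \end{bmatrix} \in M_{2\times 2}(\mathbb{C})$$
is singular. We have already noted that
$\delta_2 : \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \mapsto a_{11}a_{22} - a_{12}a_{21}$
is a determinant function, and so this happens if and only if
$\delta_2(A) = a^2 + b^2 + c^2 + d^2 = 0$, i.e., if and only if $a = b = c = d = 0$.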


Proposition 11.3 Let $F$ be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. If $A$ is a matrix in $M_{n\times n}(F)$ having two identical rows then
$\delta_n(A) = 0$.
Proof Suppose that rows $h$ and $k$ of $A$ are identical. First, assume that the characteristic
of $F$ is other than 2. Then $A = E_{hk}A$ and so $\delta_n(A) = \delta_n(E_{hk}A) = -\delta_n(A)$,
which implies that $\delta_n(A) = 0$. If the characteristic of $F$ equals 2 then $\delta_n(A) = \delta_n(E_{hk;1}A)$,
and $E_{hk;1}A$ is a matrix one of the rows of which has all of its entries equal to 0,
since in characteristic 2 adding a row to an identical row yields the zero row.
Therefore, by Proposition 11.2, $\delta_n(A) = 0$.


Proposition 11.4 Let $F$ be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. If $A, B \in M_{n\times n}(F)$ then

(1) $\delta_n(AB) = \delta_n(A)\delta_n(B)$;

(2) $\delta_n(AB) = \delta_n(BA)$.


Proof (1) By Proposition 9.1, we know that $AB$ is nonsingular if and only
if both $A$ and $B$ are nonsingular. Therefore, by Proposition 11.2, $\delta_n(A) = 0$ or $\delta_n(B) = 0$ if and
only if $\delta_n(AB) = 0$, and in that case both sides of (1) equal 0. If $\delta_n(A) \ne 0 \ne \delta_n(B)$ then there exist elementary matrices
$E_1, \ldots, E_t, G_1, \ldots, G_s$ in $M_{n\times n}(F)$ such that $B = E_1 \cdots E_t I$ and $A = G_1 \cdots G_s I$
and so $AB = G_1 \cdots G_s E_1 \cdots E_t I$, which implies that $\delta_n(AB) = \delta_n(A)\delta_n(B)$ from
the definition of a determinant function.

(2) This is an immediate consequence of (1), since $\delta_n(A)\delta_n(B) = \delta_n(B)\delta_n(A)$
in $F$.


Proposition 11.5 Let $F$ be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. If $A \in M_{n\times n}(F)$ is nonsingular then $\delta_n(A^{-1}) = \delta_n(A)^{-1}$.


Proof By Proposition 11.4, we see that $\delta_n(A^{-1})\delta_n(A) = \delta_n(A^{-1}A) = \delta_n(I) = 1$
and from this the result follows immediately.


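Proposition 11.6 Let $F$ be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. If $A \in M_{n\times n}(F)$ then:

(1) $\delta_n(AE_{ij}) = -\delta_n(A)$ for all $1 \le i \ne j \le n$;

(2) $\delta_n(AE_{ij;c}) = \delta_n(A)$ for all $1 \le i \ne j \le n$ and all $c \in F$;

(3) $\delta_n(AE_{i;c}) = c\,\delta_n(A)$ for all $1 \le i \le n$ and all $0 \ne c \in F$.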


Proof This is a direct consequence of the definition of the determinant function and 
Proposition 11.4(2). 


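Proposition 11.7 Let $F$ be a field and let $\delta_n : M_{n\times n}(F) \to F$ be a determinant
function. If $A \in M_{n\times n}(F)$ then $\delta_n(A) = \delta_n(A^T)$.

Proof If $A$ is singular then so is $A^T$, and so $\delta_n(A) = 0 = \delta_n(A^T)$. If $A$ is nonsingular
then there exist elementary matrices $E_1, \ldots, E_t$ in $M_{n\times n}(F)$ such that
$E_1 \cdots E_t A = I = I^T = A^T E_t^T \cdots E_1^T$. By our remarks in Chap. 9 concerning the
transposes of elementary matrices, and by the remarks at the beginning of this
chapter, we have $\delta_n(E_h) = \delta_n(E_h^T) \ne 0$ for each $1 \le h \le t$. Since, by Proposition 11.4,
$\delta_n(E_1) \cdots \delta_n(E_t)\delta_n(A) = \delta_n(E_1 \cdots E_t A) = \delta_n(A^T E_t^T \cdots E_1^T) = \delta_n(A^T)\delta_n(E_t^T) \cdots \delta_n(E_1^T)$,
we conclude that $\delta_n(A) = \delta_n(A^T)$.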


Of course, at this stage we do not know that determinant functions $\delta_n : M_{n\times n}(F) \to F$
even exist for the case $n > 2$ and so we now have to construct
them. Let us denote the set of all permutations of the set $\{1, \ldots, n\}$ by $S_n$. We note
that any $\pi \in S_n$ is a bijective function from $\{1, \ldots, n\}$ to itself and so there exists
a function $\pi^{-1} \in S_n$ satisfying the condition that $\pi\pi^{-1}$ and $\pi^{-1}\pi$ are equal to the
identity function $i \mapsto i$. We also note that if $\pi, \pi' \in S_n$ then $\pi\pi' \in S_n$.


Proposition 11.8 If $n$ is a positive integer then the number of elements of $S_n$
equals $n!$.


Proof Suppose we wanted to construct an arbitrary element $\pi$ of $S_n$. There are $n$
possibilities for selecting $\pi(1)$. Once we have done that, there are $n - 1$ ways of
selecting $\pi(2)$, then $n - 2$ ways of selecting $\pi(3)$, etc. Thus, the total number of
ways in which we can define $\pi$ is $n(n - 1) \cdots 1 = n!$.


Now let $\pi \in S_n$ and let $1 \le i < j \le n$. The pair $(i, j)$ is called an inversion with
respect to $\pi$ if and only if $\pi(i) > \pi(j)$. That is to say, $(i, j)$ is an inversion with
respect to $\pi$ if and only if
$$\frac{i - j}{\pi(i) - \pi(j)} < 0.$$
We will denote the number of distinct inversions with respect to $\pi$ by $h(\pi)$, and
define the signum of $\pi$ to be $\operatorname{sgn}(\pi) = (-1)^{h(\pi)}$. Thus
$$\operatorname{sgn}(\pi) = \begin{cases} 1 & \text{if there are an even number of inversions with respect to } \pi, \\ -1 & \text{if there are an odd number of inversions with respect to } \pi. \end{cases}$$
It is easy to check that $\operatorname{sgn}(\pi) = \operatorname{sgn}(\pi^{-1})$ for all $\pi \in S_n$. If $\operatorname{sgn}(\pi) = 1$, the
permutation $\pi$ is even; if $\operatorname{sgn}(\pi) = -1$, the permutation $\pi$ is odd.


Example Let $\pi \in S_4$ be defined by $1 \mapsto 3$, $2 \mapsto 4$, $3 \mapsto 2$, and $4 \mapsto 1$. Then if we
consider all possible pairs $(i, j)$ with $1 \le i < j \le 4$ we get

$(i, j)$     $(\pi(i), \pi(j))$     inversion?
(1,2)       (3,4)               no
(1,3)       (3,2)               yes
(1,4)       (3,1)               yes
(2,3)       (4,2)               yes
(2,4)       (4,1)               yes
(3,4)       (2,1)               yes

and so we see that $\operatorname{sgn}(\pi) = -1$.
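The count in this example is easy to mechanize. The sketch below represents a permutation as a tuple whose entry in position $i$ (1-based) is the image of $i$; the function names `inversions` and `sgn` are illustrative choices, not notation from the text.

```python
from itertools import combinations

def inversions(perm):
    """Return the pairs (i, j), i < j, with perm(i) > perm(j); indices are 1-based."""
    n = len(perm)
    return [(i, j) for i, j in combinations(range(1, n + 1), 2)
            if perm[i - 1] > perm[j - 1]]

def sgn(perm):
    """Signum: +1 for an even number of inversions, -1 for an odd number."""
    return -1 if len(inversions(perm)) % 2 else 1

pi = (3, 4, 2, 1)          # 1 -> 3, 2 -> 4, 3 -> 2, 4 -> 1
print(inversions(pi))      # the five inverted pairs of the table
print(sgn(pi))
```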


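Now let $n$ be a positive integer and let $(K, \bullet)$ be an associative and commutative
unital $F$-algebra. Let $A = [a_{ij}] \in M_{n\times n}(K)$. We then define the function $A \mapsto |A|$
from $M_{n\times n}(K)$ to $K$ by setting
$$|A| = \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{\pi(1),1} \bullet a_{\pi(2),2} \bullet \cdots \bullet a_{\pi(n),n}.$$
Note that, by the commutativity of $K$, if $\tau = \pi^{-1}$ then
$$a_{\pi(1),1} \bullet a_{\pi(2),2} \bullet \cdots \bullet a_{\pi(n),n} = a_{1,\tau(1)} \bullet a_{2,\tau(2)} \bullet \cdots \bullet a_{n,\tau(n)}$$
and so $|A| = \sum_{\tau \in S_n} \operatorname{sgn}(\tau)\, a_{1,\tau(1)} \bullet a_{2,\tau(2)} \bullet \cdots \bullet a_{n,\tau(n)}$. Thus we see immediately
that $|A| = |A^T|$ for every $A \in M_{n\times n}(K)$. If $K = \mathbb{C}$ then, since $\overline{c + d} = \bar{c} + \bar{d}$ and
$\overline{cd} = \bar{c}\bar{d}$, we also see that for $\bar{A} = [\bar{a}_{ij}]$ we have $|\bar{A}| = \overline{|A|}$. Defining this function
for an arbitrary commutative and associative unital $F$-algebra is important for us, as
we will need it in the case that $K = F[X]$, where $F$ is a field.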


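Example If $A = [a_{ij}] \in M_{3\times 3}(K)$, for an associative and commutative unital
$F$-algebra $(K, \bullet)$, then
$$\begin{aligned} |A| = {} & a_{11} \bullet a_{22} \bullet a_{33} + a_{12} \bullet a_{23} \bullet a_{31} + a_{13} \bullet a_{21} \bullet a_{32} \\ & - a_{11} \bullet a_{23} \bullet a_{32} - a_{13} \bullet a_{22} \bullet a_{31} - a_{12} \bullet a_{21} \bullet a_{33}. \end{aligned}$$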
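The defining sum over $S_n$ can be written out directly. The following sketch works with 0-based indices and Python integers or rationals in place of a general algebra $K$; the name `leibniz_det` is an illustrative choice.

```python
from itertools import permutations

def sgn(perm):
    """Signum of a permutation of range(n), computed by counting inversions."""
    n = len(perm)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def leibniz_det(a):
    """Sum over all permutations pi of sgn(pi) * a[pi(0)][0] * ... * a[pi(n-1)][n-1]."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sgn(perm)
        for col, row in enumerate(perm):
            term *= a[row][col]      # the factor a_{pi(col), col}
        total += term
    return total

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
A_transposed = [list(col) for col in zip(*A)]
print(leibniz_det(A), leibniz_det(A_transposed))
```

The loop visits all $n!$ permutations, so this is useful only for very small matrices.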


Proposition 11.9 Let $F$ be a field, let $(K, \bullet)$ be an associative and commutative
unital $F$-algebra, and let $A = [a_{ij}] \in M_{n\times n}(K)$. Pick $1 \le h \le n$ and
write $a_{hj} = b_{hj} + c_{hj}$ in $K$ for all $1 \le j \le n$. For all $1 \le i \le n$ satisfying
$i \ne h$, set $b_{ij} = c_{ij} = a_{ij}$. Set $B = [b_{ij}]$ and $C = [c_{ij}]$, matrices in $M_{n\times n}(K)$.
Then $|A| = |B| + |C|$.
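Proof From the definition of $|A|$, we have
$$\begin{aligned} |A| &= \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{1,\pi(1)} \bullet \cdots \bullet a_{h,\pi(h)} \bullet \cdots \bullet a_{n,\pi(n)} \\ &= \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{1,\pi(1)} \bullet \cdots \bullet [b_{h,\pi(h)} + c_{h,\pi(h)}] \bullet \cdots \bullet a_{n,\pi(n)} \\ &= \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{1,\pi(1)} \bullet \cdots \bullet b_{h,\pi(h)} \bullet \cdots \bullet a_{n,\pi(n)} \\ &\quad + \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, a_{1,\pi(1)} \bullet \cdots \bullet c_{h,\pi(h)} \bullet \cdots \bullet a_{n,\pi(n)} \\ &= |B| + |C|, \end{aligned}$$
as required.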


We are now ready to prove that determinant functions, in fact, always exist. 


Proposition 11.10 For an integer $n > 1$ and a field $F$, the function
$M_{n\times n}(F) \to F$ defined by $A \mapsto |A|$ is a determinant function.


Proof In order to simplify our notation, we will make the following temporary
convention: if $\pi \in S_n$ and if $A = [a_{ij}] \in M_{n\times n}(F)$, we will write
$u(\pi, A) = \operatorname{sgn}(\pi)\, a_{1,\pi(1)} \cdots a_{n,\pi(n)}$. Now let us check the five conditions of a determinant
function.

(1) Clearly, $u(\pi, I)$ equals 1 if $\pi$ is the identity permutation and 0 otherwise, and
so $|I| = 1$.

(2) Let $A$ be a matrix one of the rows of which has all of its entries equal to 0.
Since a factor from each row appears in every term $u(\pi, A)$, we conclude that all of
these are equal to 0 and hence $|A| = 0$.

(3) Let $A$ be a matrix and let $B = E_{ij}A$. Let $\rho \in S_n$ be the permutation which
interchanges $i$ and $j$ and leaves all of the other numbers between 1 and $n$ fixed.
Then $\operatorname{sgn}(\pi\rho) = -\operatorname{sgn}(\pi)$ for all $\pi \in S_n$ and so for each $\pi \in S_n$ we have
$u(\pi, B) = -u(\pi\rho, A)$. Since $\pi \mapsto \pi\rho$ is a bijection from $S_n$ to itself, summing over
all $\pi \in S_n$ shows that $|B| = -|A|$.

(4) Let $A$ be a matrix and let $B = E_{ij;c}A$. Then $B = [b_{ht}]$, where $b_{ht} = a_{ht}$ when
$h \ne j$ and $1 \le t \le n$, and where $b_{jt} = a_{jt} + ca_{it}$ for all $1 \le t \le n$. By Proposition 11.9,
we have $|B| = |A| + |C|$, where $C$ is the matrix all of the rows of which
except the $j$th are identical with those of $A$, and where in the $j$th row we have
$c_{jt} = ca_{it}$ for all $1 \le t \le n$. Then $|C| = c|D|$ where $D$ is a matrix in which two
rows, the $i$th and the $j$th, are equal. If the characteristic of $F$ is other than 2, then
$D = E_{ij}D$ and so, by (3), we get $|D| = -|D|$, and so we get $|C| = c|D| = 0$ and
we have $|A| = |B|$, which is what we want. Therefore, let us assume that the characteristic
of $F$ equals 2. Let $\rho \in S_n$ be the permutation which interchanges $i$ and
$j$ and leaves all other numbers between 1 and $n$ fixed. Let $H$ be the set of all
even permutations in $S_n$ and let $K$ be the set of all odd permutations. The function
from $H$ to $K$ defined by $\pi \mapsto \pi\rho$ is bijective since $\pi_1\rho = \pi_2\rho$ implies that
$\pi_1 = \pi_1\rho\rho^{-1} = \pi_2\rho\rho^{-1} = \pi_2$. Moreover, since the $i$th and $j$th rows of $D$ are equal and
the characteristic of $F$ is 2, we have $u(\pi, D) = u(\pi\rho, D)$ for all $\pi \in H$, and so
$u(\pi, D) + u(\pi\rho, D) = 0$ for all $\pi \in H$. Therefore, $|D| = \sum_{\pi \in H}[u(\pi, D) + u(\pi\rho, D)] = 0$ and this implies,
again, that $|C| = 0$ and so $|A| = |B|$.

(5) It is clear from the definition of $|A|$ that if $B = E_{i;c}A$ then $|B| = c|A|$.


Thus, in summary, we see that if $F$ is a field and if $n$ is a positive integer, then
there exists a unique determinant function $M_{n\times n}(F) \to F$, namely $A \mapsto |A|$. We
call the scalar $|A|$ the determinant of the matrix $A$.


With kind permission of the Archives of the Mathematisches 
Forschungsinstitut Oberwolfach (Scherk). 

Determinants were first used in the work of 
the seventeenth-century German mathematician, 
philosopher, and diplomat Gottfried von Leibniz,
who developed calculus along with Sir Isaac New- 
ton. The common properties of determinants were 
first studied by the nineteenth-century German 
mathematician Heinrich Scherk, and the first systematic analysis of the theory of determi- 
nants was done by the nineteenth-century French mathematician Augustin-Louis Cauchy, 
relying on the work of many mathematicians who preceded him. His work was continued 
by Cayley and Sylvester. The term “determinant” was first used by Gauss in 1801, and was 
popularized by Jacobi. 


Example Let $n > 1$ be an integer. If $c_1, \ldots, c_n$ are distinct elements of a field $F$
and if $A = [a_{ij}] \in M_{n\times n}(F)$ is the Vandermonde matrix defined by $a_{ij} = c_i^{j-1}$ for
all $1 \le i, j \le n$, then it is easy to verify that $|A| = \prod_{i<j}(c_j - c_i) \ne 0$. This result
can, in fact, be generalized. Suppose that, for $1 \le h \le n$, we have a polynomial
$p_h(X) = \sum_{i=1}^{h} b_{ih}X^{i-1} \in F[X]$ with $b_{hh} \ne 0$. Let $c_1, \ldots, c_n$ be distinct elements of a
field $F$ and let $A = [a_{ij}] \in M_{n\times n}(F)$ be defined by $a_{ij} = p_j(c_i)$ for all $1 \le i, j \le n$.
Then $|A| = b_{11} \cdots b_{nn} \prod_{i<j}(c_j - c_i) \ne 0$.


Example As a consequence of Proposition 11.7, we note that if $n > 0$ is odd and if
$A \in M_{n\times n}(F)$ is a skew-symmetric matrix then $|A| = |A^T| = |-A| = -|A|$ and so
$|A| = 0$. Therefore, by Proposition 11.2, $A$ is singular. If $n$ is even, then one can use
the definition of $|A|$ to show that $|A| = b^2$ for some $b$ which is a sum of products of
the $a_{ij}$. Thus, for example,
$$\begin{vmatrix} 0 & a_{12} & a_{13} & a_{14} \\ -a_{12} & 0 & a_{23} & a_{24} \\ -a_{13} & -a_{23} & 0 & a_{34} \\ -a_{14} & -a_{24} & -a_{34} & 0 \end{vmatrix} = [a_{12}a_{34} - a_{13}a_{24} + a_{14}a_{23}]^2.$$
This number $b$ is called the Pfaffian of the matrix $A$. Pfaffians arise naturally in
combinatorics, differential geometry, and other areas of mathematics.


Pfaffians were first defined by Cayley, and named in honor of Johann 
Pfaff, an eighteenth-century German mathematician whose most fa- 
mous doctoral student was Gauss. 


We now give two examples of why it was worthwhile to define | A| for matrices 
A with entries in an associative and commutative unital F-algebra, and not just a 
field. 


Example Let $V$ be a vector space of finite dimension $n$ over a field $F$ and let
$B = \{v_1, \ldots, v_k\}$ be a linearly-independent subset of $V$. Let $y_1, \ldots, y_k$ be a list
of vectors in $V$. We claim that there are at most finitely-many elements $a$ of $F$
satisfying the condition that the list $v_1 + ay_1, \ldots, v_k + ay_k$ is linearly dependent. To
establish this claim, we will consider determinants of matrices over $F[X]$. Indeed,
extend $B$ to a basis $D = \{v_1, \ldots, v_n\}$ of $V$. Then, for each $1 \le i \le k$, we can write
$y_i = \sum_{j=1}^n c_{ij}v_j$. For each $1 \le i, j \le k$, define the polynomial $p_{ij}(X) \in F[X]$ by
setting
$$p_{ij}(X) = \begin{cases} c_{ij}X + 1 & \text{if } i = j, \\ c_{ij}X & \text{otherwise,} \end{cases}$$
and consider the matrix $B = [p_{ij}(X)] \in M_{k\times k}(F[X])$. Then $|B|$ is a polynomial
$q(X)$ in $F[X]$, which is not the 0-polynomial since $q(0) = 1$. Moreover, for any
$a \in F$, we see that $q(a) = 0$ whenever the list $v_1 + ay_1, \ldots, v_k + ay_k$ is linearly
dependent. Since a polynomial can have only finitely-many distinct roots, this can
happen only for finitely-many values of $a$.


Example Let $n > 1$ be an integer and let $U$ be an open interval of real numbers. Let
$K$ be the set of all functions in $\mathbb{R}^U$ which are differentiable at least $n - 1$ times.
Then $K$ is an associative and commutative unital $\mathbb{R}$-algebra which is not entire, let
alone a field. We will denote the derivative of a function $f \in K$ by $Df$ and, if $h > 1$,
we will denote the $h$th derivative of $f$ by $D^h f$. Given $f_1, \ldots, f_n \in K$, the function
$$W(f_1, \ldots, f_n) : t \mapsto \begin{vmatrix} f_1(t) & f_2(t) & \cdots & f_n(t) \\ (Df_1)(t) & (Df_2)(t) & \cdots & (Df_n)(t) \\ \vdots & \vdots & & \vdots \\ (D^{n-1}f_1)(t) & (D^{n-1}f_2)(t) & \cdots & (D^{n-1}f_n)(t) \end{vmatrix}$$
is called the Wronskian of $f_1, \ldots, f_n$. One can show that if we have
$W(f_1, \ldots, f_n)(t) \ne 0$ for some $t \in U$ then the subset $\{f_1, \ldots, f_n\}$ of $K$ is linearly independent
over $\mathbb{R}$. The converse is false. To see this, let $U$ be an open interval containing the
origin, let $f_1 : t \mapsto t^3$, and let $f_2 : t \mapsto |t^3|$. Then $\{f_1, f_2\}$ is linearly independent
over $\mathbb{R}$, but $W(f_1, f_2)(t) = 0$ for any $t \in U$.


The insight of Josef Wronski, a nineteenth-century Polish mathemati- 
cian living in France, was obscured by his decidedly eccentric philo- 
sophical ideas and style of writing, and was recognized only after his 
death. The notion of a determinant of functions was first used by Ja- 
cobi. 


Example Let $n$ be a positive integer equal to 2 or divisible by 4. A matrix
$A = [a_{ij}] \in M_{n\times n}(\mathbb{C})$ with $|a_{ij}| \le 1$ for all $1 \le i, j \le n$ having maximal possible
determinant (in absolute value) is known as an Hadamard matrix (though, in
fact, such matrices were studied by Sylvester, a generation before Hadamard considered
them). For such a matrix, the absolute value of $|A|$ equals $n^{n/2}$, and the entries of $A$ are all
$\pm 1$. Indeed, a matrix $A$ is an Hadamard matrix precisely when all of its entries
are $\pm 1$ and $AA^T = nI$. Thus,
$$\begin{bmatrix} -1 & 1 & 1 & 1 \\ 1 & -1 & 1 & 1 \\ 1 & 1 & -1 & 1 \\ 1 & 1 & 1 & -1 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$
are Hadamard matrices. Moreover, for each $t \ge 1$, there exists an Hadamard matrix
$H_t$ of size $2^t \times 2^t$, defined recursively by setting
$$H_1 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \quad\text{and}\quad H_t = \begin{bmatrix} H_{t-1} & H_{t-1} \\ H_{t-1} & -H_{t-1} \end{bmatrix}$$
for each $t > 1$.


We also note immediately that if $A$ is an Hadamard matrix so are $A^T$ and $-A$.
Hadamard matrices have important applications in algebraic coding theory, especially
in defining the error-correcting Reed-Muller codes. Needless to say, the determinants
of Hadamard matrices get very big very quickly. If $A$ is a $16 \times 16$ Hadamard
matrix, then the absolute value of $|A|$ is 4,294,967,296, and if $B$ is a $32 \times 32$ Hadamard matrix, then
the absolute value of $|B|$ is 1,208,925,819,614,629,174,706,176.
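The recursive doubling construction and the criterion $AA^T = nI$ can be checked mechanically. The sketch below assumes the standard Sylvester starting block $H_1$ with rows $(1, 1)$ and $(1, -1)$; the function names are illustrative.

```python
def sylvester(t):
    """Return the 2^t x 2^t matrix H_t built by repeated doubling (t >= 1)."""
    h = [[1, 1], [1, -1]]                      # assumed starting block H_1
    for _ in range(t - 1):
        top = [row + row for row in h]         # [ H  H ]
        bottom = [row + [-x for x in row] for row in h]   # [ H -H ]
        h = top + bottom
    return h

def is_hadamard(h):
    """Check that all entries are +1 or -1 and that h times its transpose is n*I."""
    n = len(h)
    if any(abs(x) != 1 for row in h for x in row):
        return False
    return all(
        sum(h[i][k] * h[j][k] for k in range(n)) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )

print(is_hadamard(sylvester(4)))               # the 16 x 16 case
print(16 ** 8, 32 ** 16)                       # n^(n/2) for n = 16 and n = 32
```

The last line evaluates $n^{n/2}$ exactly, which is how large the determinant of an Hadamard matrix of that size is in absolute value.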


We still are faced with the problem of actually computing the determinant of an
$n \times n$ matrix $A$, especially when $n$ is large. If we work using the definition, we
see that we must add $n!$ summands, each of which requires $n - 1$ multiplications.
The total number of arithmetic operations needed is therefore $(n - 1)n! + (n! - 1) = n(n!) - 1$,
which is a huge number even if $n$ is relatively small. For example, if
we are using a computer capable of performing a billion arithmetic operations per
second, it would take us about 12,300,000,000 years of nonstop computation to compute
the determinant of a $25 \times 25$ matrix, based on the definition. Thus we must find
better methods of computing determinants, a task which became a high priority for
many nineteenth-century mathematicians.
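The size of this operation count is easy to reproduce. In the sketch below, the machine speed of a billion operations per second comes from the text, while the year length of 365.25 days is an assumption of the example.

```python
from math import factorial

def ops_by_definition(n):
    """(n - 1) * n! multiplications plus n! - 1 additions, i.e. n * n! - 1."""
    return n * factorial(n) - 1

OPS_PER_SECOND = 10 ** 9
SECONDS_PER_YEAR = 365.25 * 24 * 3600      # assumed length of a year

years = ops_by_definition(25) / OPS_PER_SECOND / SECONDS_PER_YEAR
print(ops_by_definition(25))
print(f"{years:.3e} years")
```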
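Example Let $A = [a_{ij}] \in M_{n\times n}(F)$ be a matrix in which $a_{11} \ne 0$. Then Chiò,
Dodgson, and others showed that $|A| = a_{11}^{2-n}|B|$, where $B \in M_{(n-1)\times(n-1)}(F)$ is
the matrix obtained from $A$ by erasing the first row and first column and replacing
each other $a_{ij}$ by $\begin{vmatrix} a_{11} & a_{1j} \\ a_{i1} & a_{ij} \end{vmatrix}$. Thus, for example,
$$\begin{vmatrix} 1 & 2 & 3 & 4 \\ 8 & 7 & 6 & 5 \\ 1 & 8 & 2 & 7 \\ 3 & 6 & 4 & 5 \end{vmatrix} = \begin{vmatrix} \begin{vmatrix} 1 & 2 \\ 8 & 7 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 8 & 6 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 8 & 5 \end{vmatrix} \\[2ex] \begin{vmatrix} 1 & 2 \\ 1 & 8 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 1 & 2 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 1 & 7 \end{vmatrix} \\[2ex] \begin{vmatrix} 1 & 2 \\ 3 & 6 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 3 & 4 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 3 & 5 \end{vmatrix} \end{vmatrix} = \begin{vmatrix} -9 & -18 & -27 \\ 6 & -1 & 3 \\ 0 & -5 & -7 \end{vmatrix} = -144.$$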

This method can, of course, be iterated. The method of evaluating determinants in 
this way is known as the method of condensation. 
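Iterated condensation can be sketched as follows. This is an illustration only: it assumes that the top-left entry is nonzero at every stage (it raises an error otherwise rather than reordering rows), and it uses exact rational arithmetic; the name `chio_det` is hypothetical.

```python
from fractions import Fraction

def chio_det(matrix):
    """Determinant by repeated condensation on the top-left entry."""
    m = [[Fraction(x) for x in row] for row in matrix]
    factor = Fraction(1)
    while len(m) > 1:
        n = len(m)
        pivot = m[0][0]
        if pivot == 0:
            raise ValueError("condensation needs a nonzero top-left entry")
        factor *= pivot ** (2 - n)             # the scalar a11^(2-n)
        # Replace each remaining a_ij by the 2x2 determinant |a11 a1j; ai1 aij|.
        m = [[pivot * m[i][j] - m[0][j] * m[i][0] for j in range(1, n)]
             for i in range(1, n)]
    return factor * m[0][0]

A = [[1, 2, 3, 4], [8, 7, 6, 5], [1, 8, 2, 7], [3, 6, 4, 5]]
print(chio_det(A))
```

Each pass shrinks the matrix by one row and one column, so the work is on the order of $n^3$ arithmetic operations rather than $n!$.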


© George E. Andrews (Andrews). 


During the nineteenth century, matrix theory 
and the theory of determinants attracted many 
gifted mathematicians and mathematical amateurs.
Felice Chiò was a nineteenth-century Italian
mathematician and physicist. On the other
hand, Rev. Charles Lutwidge Dodgson was an 
amateur who is better known by his pen name 
Lewis Carroll, the author of Alice in Wonderland. Dodgson published several works on 
mathematics and mathematical logic. In the twentieth century, ingenious ways for com- 
puting determinants of matrices arising from various combinatorial problems have been 
devised by American mathematician George Andrews. 


Let $A = [a_{ij}] \in M_{n\times n}(K)$, where $K$ is an associative and commutative unital
$F$-algebra. For each $1 \le i, j \le n$, we define the minor of the entry $a_{ij}$ of $A$ to be
$|A_{ij}|$, where $A_{ij} \in M_{(n-1)\times(n-1)}(K)$ is the matrix obtained from $A$ by erasing the
$i$th row and the $j$th column.


Example If A= 


Nw 
WwW oo Ww 
BOF 


sthen Ars = | and Aza = | 5 ai 
Proposition 11.11 Let $F$ be a field, let $(K, \bullet)$ be an associative and commutative
unital $F$-algebra. If $n$ is a positive integer, and $A = [a_{ij}] \in M_{n\times n}(K)$,
then $|A| = \sum_{j=1}^n (-1)^{t+j} a_{tj} \bullet |A_{tj}|$ for each $1 \le t \le n$.


Proof In order to simplify our notation, let $\det(y_1, \ldots, y_n)$ denote the determinant
of the matrix the rows of which are $y_1, \ldots, y_n$. We will first prove the theorem for the
case $t = 1$. That is to say, we must show that $|A|$ equals $\sum_{j=1}^n (-1)^{1+j} a_{1j} \bullet |A_{1j}|$.
For each $1 \le h \le n$, let $v_h \in M_{1\times n}(K)$ be the matrix $[d_1 \; \ldots \; d_n]$ defined by
$$d_i = \begin{cases} 1 & \text{if } i = h, \\ 0 & \text{otherwise.} \end{cases}$$
Then the $i$th row of $A$ can be written as $w_i = \sum_{j=1}^n a_{ij}v_j$ and so, by Proposition 11.9,
$$|A| = \det(w_1, \ldots, w_n) = \det\left(\sum_{j=1}^n a_{1j}v_j, w_2, \ldots, w_n\right) = \sum_{j=1}^n a_{1j} \bullet \det(v_j, w_2, \ldots, w_n).$$
Thus we will prove the desired result if we can show that
$$\det(v_j, w_2, \ldots, w_n) = (-1)^{1+j}|A_{1j}|$$
for each $1 \le j \le n$. Denote the matrix the rows of which are $v_j, w_2, \ldots, w_n$ by
$B = [b_{ih}]$, where
$$b_{ih} = \begin{cases} 1 & \text{if } i = 1 \text{ and } h = j, \\ 0 & \text{if } i = 1 \text{ and } h \ne j, \\ a_{ih} & \text{if } i > 1. \end{cases}$$
For $1 \le j \le n$, set $G_{1j} = \{\pi \in S_n \mid \pi(1) = j\}$.
Suppose that $j = 1$. Then, in particular, there is a bijective correspondence between
$G_{11}$ and the set of all permutations of $\{2, \ldots, n\}$ which does not affect the
signum of the permutation since if $\pi \in G_{11}$ then 1 does not appear in any inversion
of $\pi$. Since $b_{11} = 1$ and $b_{1h} = 0$ if $h > 1$, we thus have
$$\begin{aligned} |B| &= \sum_{\pi \in S_n} \operatorname{sgn}(\pi)\, b_{1,\pi(1)} \bullet \cdots \bullet b_{n,\pi(n)} \\ &= \sum_{\pi \in G_{11}} \operatorname{sgn}(\pi)\, b_{1,\pi(1)} \bullet \cdots \bullet b_{n,\pi(n)} \\ &= \sum_{\pi \in G_{11}} \operatorname{sgn}(\pi)\, b_{2,\pi(2)} \bullet \cdots \bullet b_{n,\pi(n)} = |A_{11}|, \end{aligned}$$
and so we have shown, as desired, that $|B| = (-1)^{1+1}|A_{11}|$. If $j > 1$ put column $j$
of $B$ in the position of the first column and shift columns 1 to $j - 1$ of $B$ to the right
by one column position. This involves $j - 1$ column interchanges, and we have
$$\det(v_j, w_2, \ldots, w_n) = (-1)^{j-1} \begin{vmatrix} 1 & 0 & \cdots & 0 \\ a_{2j} & a_{21} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{nj} & a_{n1} & \cdots & a_{nn} \end{vmatrix} = (-1)^{1+j}|A_{1j}|.$$
Now assume that $t > 1$. Again, we can interchange the $t$th row with the first row
by $t - 1$ exchanges with the row above, and we get $|A| = (-1)^{t-1}|C|$, where $C$ is a
matrix satisfying $|C_{1j}| = |A_{tj}|$ for each $1 \le j \le n$. Therefore,
$$|A| = (-1)^{t-1}|C| = (-1)^{t-1} \sum_{j=1}^n (-1)^{1+j} a_{tj} \bullet |C_{1j}| = \sum_{j=1}^n (-1)^{t+j} a_{tj} \bullet |A_{tj}|$$
as desired.
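Example For $A = \begin{bmatrix} 1 & 7 & 3 & 0 \\ 4 & 0 & 1 & 3 \\ 0 & 2 & 4 & 0 \\ 3 & 1 & 5 & 1 \end{bmatrix}$ we see that
$$|A| = 1\begin{vmatrix} 0 & 1 & 3 \\ 2 & 4 & 0 \\ 1 & 5 & 1 \end{vmatrix} - 7\begin{vmatrix} 4 & 1 & 3 \\ 0 & 4 & 0 \\ 3 & 5 & 1 \end{vmatrix} + 3\begin{vmatrix} 4 & 0 & 3 \\ 0 & 2 & 0 \\ 3 & 1 & 1 \end{vmatrix} - 0\begin{vmatrix} 4 & 0 & 1 \\ 0 & 2 & 4 \\ 3 & 1 & 5 \end{vmatrix} = 16 + 140 - 30 + 0 = 126$$
and
$$|A| = 0\begin{vmatrix} 7 & 3 & 0 \\ 0 & 1 & 3 \\ 1 & 5 & 1 \end{vmatrix} - 2\begin{vmatrix} 1 & 3 & 0 \\ 4 & 1 & 3 \\ 3 & 5 & 1 \end{vmatrix} + 4\begin{vmatrix} 1 & 7 & 0 \\ 4 & 0 & 3 \\ 3 & 1 & 1 \end{vmatrix} - 0\begin{vmatrix} 1 & 7 & 3 \\ 4 & 0 & 1 \\ 3 & 1 & 5 \end{vmatrix} = 0 - 2 + 128 - 0 = 126.$$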
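Expansion by minors along a chosen row translates into a short recursive function. The sketch below uses 0-based row and column indices, so the sign $(-1)^{t+j}$ is unchanged; the names `minor` and `laplace_det` are illustrative.

```python
def minor(a, i, j):
    """Matrix obtained from a by erasing row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def laplace_det(a, t=0):
    """Expand along row t; the smaller determinants are expanded along their first row."""
    n = len(a)
    if n == 1:
        return a[0][0]
    return sum(
        (-1) ** (t + j) * a[t][j] * laplace_det(minor(a, t, j))
        for j in range(n)
        if a[t][j] != 0                      # zero entries contribute nothing
    )

A = [[1, 7, 3, 0], [4, 0, 1, 3], [0, 2, 4, 0], [3, 1, 5, 1]]
print(laplace_det(A, 0), laplace_det(A, 2))  # first-row and third-row expansions
```

Skipping zero entries is what makes a row with many zeros attractive for hand computation as well.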


Even this method of computing determinants is not easy, however, unless there is
a row (or column) of the matrix in which a significant number of the entries are equal
to 0. To see the computational overhead of computing the determinant of a general
$n \times n$ matrix using minors, let us denote the number of arithmetic operations needed
to do so by $p_n$. Clearly $p_1 = 0$ and $p_2 = 3$. Suppose that we have already found
$p_{n-1}$. Then, by Proposition 11.11, we see that in order to compute the determinant of
an $n \times n$ matrix we have to compute the determinants of $n$ matrices of size $(n - 1) \times (n - 1)$
and then perform $n$ multiplications and $n - 1$ additions/subtractions. That is
to say, we obtain the recursive formula
$$p_n = np_{n-1} + n + (n - 1) = np_{n-1} + 2n - 1,$$
when $n \ge 2$. Setting $t_n = \frac{p_n}{n!}$, we see that
$$t_n - t_{n-1} = \frac{2n - 1}{n!} = \frac{2}{(n-1)!} - \frac{1}{n!}$$
and so, since $t_1 = 0$,
$$\begin{aligned} t_n &= [t_n - t_{n-1}] + [t_{n-1} - t_{n-2}] + \cdots + [t_2 - t_1] + t_1 \\ &= \left[\frac{2}{(n-1)!} - \frac{1}{n!}\right] + \cdots + \left[\frac{2}{1!} - \frac{1}{2!}\right] \\ &= 2\left[\frac{1}{(n-1)!} + \cdots + \frac{1}{1!}\right] - \left[\frac{1}{n!} + \cdots + \frac{1}{2!}\right] \\ &= \left[\frac{1}{(n-1)!} + \cdots + \frac{1}{1!}\right] + 1 - \frac{1}{n!} \\ &= \left[\frac{1}{n!} + \frac{1}{(n-1)!} + \cdots + \frac{1}{1!} + 1\right] - \frac{2}{n!} \end{aligned}$$
and thus we see that $p_n = n!\left[\frac{1}{n!} + \frac{1}{(n-1)!} + \cdots + \frac{1}{1!} + 1\right] - 2$. But from calculus we
know that $e$, the base of the natural logarithms, has an expansion of the form
$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + \frac{e^c}{(n+1)!},$$
where $0 < c < 1$, and so $p_n = n!\left[e - \frac{e^c}{(n+1)!}\right] - 2 = en! - \frac{e^c}{n+1} - 2$. If $n \ge 2$, we see that
$$0 < \frac{e^c}{n+1} < \frac{e}{n+1} \le \frac{e}{3} < 1$$
and so we conclude that $en! - 3 < p_n < en! - 2$. Since $p_n$ is an integer, we
see that $p_n = \lfloor en! \rfloor - 2$, where $\lfloor r \rfloor$ denotes the largest whole number less than or
equal to $r$, for any real number $r$. In particular, we see that $p_n$ grows even faster than
exponentially, as a function of $n$, which is very rapid growth indeed. For example,
$p_{10} = 9{,}864{,}099$ and $p_{15} = 3{,}554{,}627{,}472{,}074$.
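The recursion for the operation count can be iterated exactly and compared with the closed form involving the partial sums of $1/m!$. The sketch below assumes the starting value $p_1 = 0$ (a $1 \times 1$ determinant needs no arithmetic); both function names are illustrative.

```python
from math import factorial

def ops_by_minors(n):
    """Iterate p_k = k * p_(k-1) + 2k - 1 from the assumed start p_1 = 0."""
    p = 0
    for k in range(2, n + 1):
        p = k * p + 2 * k - 1
    return p

def closed_form(n):
    """n! * (1/0! + 1/1! + ... + 1/n!) - 2, evaluated in exact integer arithmetic."""
    return sum(factorial(n) // factorial(m) for m in range(n + 1)) - 2

for n in (2, 5, 10, 15):
    print(n, ops_by_minors(n), closed_form(n))
```

Working with integers avoids any dependence on a floating-point value of $e$, which is not accurate enough to evaluate $\lfloor en! \rfloor$ once $n!$ exceeds about $10^{15}$.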

Recently, sophisticated numerical techniques have been developed to compute 
the determinants of matrices with entries from a finite field. 

In special cases, it is also possible to find bounds on the value of the determinant
of a matrix, without actually computing it. For example, we will see below that if
$A \in M_{n\times n}(\mathbb{R})$ and if $g$ is a positive real number greater than or equal to the absolute
value of each of the entries of $A$, then the absolute value of $|A|$ is at most $g^n\sqrt{n^n}$. In
1980, American mathematicians Charles R. Johnson and Morris Newman proved a
surprising bound. Let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$. For each $1 \le i \le n$, let $b_i$ be the sum
of all positive entries in the $i$th row of $A$ and let $c_i$ be the sum of all negative entries
in the $i$th row of $A$ (the sum of an empty list is taken to be 0). The absolute value of
$|A|$ is then at most $\prod_{i=1}^n \max\{b_i, -c_i\} - \prod_{i=1}^n \min\{b_i, -c_i\}$.
Proposition 11.12 Let $n$ be a positive integer, let $F$ be a field, and let
$(K, \bullet)$ be an associative and commutative unital $F$-algebra. Let $A = [a_{ij}] \in M_{n\times n}(K)$
be a matrix which can be represented in block form as
$$\begin{bmatrix} B_{11} & O & \cdots & O \\ B_{21} & B_{22} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ B_{m1} & B_{m2} & \cdots & B_{mm} \end{bmatrix}$$
where $m > 1$ and each of the $B_{hh}$ is square. Then $|A| = \prod_{h=1}^m |B_{hh}|$.


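Proof Let us first consider the case $m = 2$, and assume that $B_{11} \in M_{t\times t}(K)$ for
some $t < n$. We will proceed by induction on $t$. If $t = 1$, then, by Proposition 11.11,
$|A| = a_{11}|B_{22}| = |B_{11}| \bullet |B_{22}|$, and we are done. Now assume that $t > 1$ and that
the result has been established for all matrices of the form $\begin{bmatrix} B_{11} & O \\ B_{21} & B_{22} \end{bmatrix}$ where
$B_{11} \in M_{(t-1)\times(t-1)}(K)$. Let $C_j$ be the matrix obtained from $B_{21}$ by deleting the
$j$th column. Then, by Proposition 11.11 and the induction hypothesis,
$$\begin{aligned} |A| &= \sum_{j=1}^t (-1)^{1+j} a_{1j} \bullet \begin{vmatrix} (B_{11})_{1j} & O \\ C_j & B_{22} \end{vmatrix} \\ &= \sum_{j=1}^t (-1)^{1+j} a_{1j} \bullet \left(|(B_{11})_{1j}| \bullet |B_{22}|\right) \\ &= \left(\sum_{j=1}^t (-1)^{1+j} a_{1j} \bullet |(B_{11})_{1j}|\right) \bullet |B_{22}| = |B_{11}| \bullet |B_{22}|, \end{aligned}$$
which establishes this case.
Now assume, inductively, that the result has been established for $m$ and consider
a matrix $A \in M_{n\times n}(K)$ which can be written in block form as
$$\begin{bmatrix} B_{11} & O & \cdots & O \\ B_{21} & B_{22} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ B_{m+1,1} & B_{m+1,2} & \cdots & B_{m+1,m+1} \end{bmatrix}.$$
If we set $C = \begin{bmatrix} B_{11} & O & \cdots & O \\ B_{21} & B_{22} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ B_{m1} & B_{m2} & \cdots & B_{mm} \end{bmatrix}$ then, by the case $m = 2$ and the induction
hypothesis, $|A| = |C| \bullet |B_{m+1,m+1}| = \prod_{h=1}^{m+1} |B_{hh}|$.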
Note that, as an immediate consequence of Proposition 11.4, we see that if all of
the matrices $B_{hh}$ are of the same size, then $|A| = |B_{11} \cdots B_{mm}|$.

Let $A = [a_{ij}] \in M_{n\times n}(K)$ for some associative and commutative unital $F$-algebra
$(K, \bullet)$. We define the adjoint of $A$ to be the matrix $\operatorname{adj}(A) = [b_{ij}] \in M_{n\times n}(K)$,
where $b_{ij} = (-1)^{i+j}|A_{ji}|$ for all $1 \le i, j \le n$.


103 5 
E eia=|- +? 4 May.4(R) th 
xample = 4212 € May4(R) then 
1 12 5 
—20 9 -I17 25 
adi(a)= 50 —-18 -—16 —40 


—40 —-18 -16 50 
10 9 13.) —35 


Proposition 11.13 Let $F$ be a field and let $n$ be a positive integer. If
$A = [a_{ij}] \in M_{n\times n}(F)$ then $A[\operatorname{adj}(A)] = |A|I$. In particular, if the matrix $A$ is
nonsingular then $A^{-1} = |A|^{-1}\operatorname{adj}(A)$.


Proof Suppose that $\operatorname{adj}(A) = [b_{ij}]$. Then $A[\operatorname{adj}(A)] = [c_{ij}]$, where
$c_{ij} = \sum_{k=1}^n a_{ik}b_{kj} = \sum_{k=1}^n (-1)^{j+k} a_{ik}|A_{jk}|$. If $i = j$, then, by Proposition 11.11, this
is just $|A|$. If $i \ne j$, this is just $|A'|$, where $A'$ is a matrix identical to $A$ in all of
its rows except the $j$th row, and that is equal to the $i$th row of $A$. Thus the matrix
$A'$ has two identical rows, and so by Proposition 11.3, $|A'|$ is equal to 0. Hence
$A[\operatorname{adj}(A)] = |A|I$, from which we also immediately deduce the second statement
since if $A$ is nonsingular then $|A| \ne 0$.
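The identity $A[\operatorname{adj}(A)] = |A|I$ can be verified on a small matrix. The sketch below computes determinants by first-row expansion, with the convention (an assumption of the example) that the empty matrix has determinant 1; the sample matrix and function names are illustrative.

```python
from fractions import Fraction

def minor(a, i, j):
    """Matrix obtained from a by erasing row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by expansion along the first row; det of the empty matrix is 1."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def adjugate(a):
    """Entry (i, j) is (-1)^(i+j) times the determinant with row j and column i erased."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
d = det(A)
product = matmul(A, adjugate(A))                     # should be d times the identity
inverse = [[Fraction(x, d) for x in row] for row in adjugate(A)]
print(d, product)
```

Dividing the adjoint by the determinant gives the inverse, here in exact rational arithmetic.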


In particular, we note that if A is nonsingular then so is adj(A). 


Proposition 11.14 Let $F$ be a field, let $(K, \bullet)$ be an associative and commutative
unital $F$-algebra, and let $n$ be a positive integer. If $A = [a_{ij}] \in M_{n\times n}(K)$
is an upper-triangular matrix then $|A| = \prod_{i=1}^n a_{ii}$.


Proof We can prove this by induction on $n$. For the case $n = 1$, it is immediate.
Assume therefore that we have already established it for all matrices in $M_{(n-1)\times(n-1)}(K)$.
Then, by Proposition 11.11, $|A| = |A^T| = \sum_{j=1}^n (-1)^{1+j} a_{j1} \bullet |A_{j1}| = a_{11} \bullet |A_{11}|$.
But, by the induction hypothesis, $|A_{11}| = \prod_{i=2}^n a_{ii}$, and we are done.
By Proposition 11.14, we see that in general, from a computational point of view,
it is much faster to first perform elementary operations on a matrix to reduce it 
to upper-triangular form, and then calculate the determinant (making use of the 
fact that, from the definition of a determinant function and from Proposition 11.4, 
we easily know the determinants of the elementary matrices), than to calculate the 
determinant directly. When working in associative and commutative unital algebras 
over a field, or when working with matrices of integers, this presents somewhat 
of a problem since it is not always possible to divide by nonzero scalars in such 
contexts. However, various variants on Gaussian elimination which do not involve 
division have been developed to overcome this. 




One of the major researchers instrumental in the development of such 
methods was the twentieth-century Swiss/American computer scien- 
tist Erwin Bareiss. 
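
As an illustration of such a division-free variant, here is a minimal sketch of fraction-free elimination in the style usually attributed to Bareiss, for matrices of integers. The function name `bareiss_det` is an illustrative choice, and the sketch assumes a nonempty square matrix. Every division performed is exact, so all intermediate values remain integers.

```python
def bareiss_det(matrix):
    """Determinant of a nonempty square integer matrix by fraction-free elimination."""
    m = [row[:] for row in matrix]   # work on a copy
    n = len(m)
    sign = 1
    prev = 1                         # pivot of the previous step
    for k in range(n - 1):
        if m[k][k] == 0:
            # Look for a row below with a nonzero entry in column k.
            swap = next((r for r in range(k + 1, n) if m[r][k] != 0), None)
            if swap is None:
                return 0             # the whole column is zero
            m[k], m[swap] = m[swap], m[k]
            sign = -sign             # interchanging two rows changes the sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The division by the previous pivot is always exact.
                m[i][j] = (m[i][j] * m[k][k] - m[i][k] * m[k][j]) // prev
        prev = m[k][k]
    return sign * m[n - 1][n - 1]
```

For instance, `bareiss_det([[1, -1, 1], [1, 2, 0], [1, 0, -1]])` agrees with the value $|A| = -5$ used in the example following Cramer's Theorem.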


Combining Propositions 11.4 and 11.14, we see that if $A \in M_{n\times n}(F)$ can be
written in the form $LU$, where $L$ is a lower-triangular matrix and $U$ is an upper-triangular
matrix, then $|A|$ is the product of the diagonal elements of $L$ and the
diagonal elements of $U$.


Proposition 11.15 (Cramer’s Theorem) Let $F$ be a field and let $n$ be a
positive integer. If $A = [a_{ij}] \in M_{n\times n}(F)$ is a nonsingular matrix and if
$w = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \in F^n$, then the system of linear equations $AX = w$ has the unique
solution $v = \begin{bmatrix} d_1 \\ \vdots \\ d_n \end{bmatrix}$ in which, for each $1 \le i \le n$, we have $d_i = |A|^{-1}|A_{(i)}|$,
where $A_{(i)}$ is the matrix formed from $A$ by replacing the $i$th column of $A$ by $w$.


Proof If $Av = w$ then $|A|v = (|A|A^{-1})Av = \mathrm{adj}(A)Av = \mathrm{adj}(A)w$ and so for each
$1 \le i \le n$, we have $|A|d_i = \sum_{j=1}^{n} (-1)^{i+j} b_j |A_{ji}|$. But the expression on the right-hand
side of this equation is just, by Proposition 11.11, $|A_{(i)}|$, developed by minors
on the $i$th column.
Gabriel Cramer was an eighteenth-century
Swiss mathematician and friend of Johann
Bernoulli (one of the formulators of calculus) who
was among the first to study determinants and
their use in solving systems of linear equations.
Cramer’s rule was also described independently
by the eighteenth-century Scottish mathematician
Colin Maclaurin.


Cramer’s theorem, published in 1750, was the first systematic method for solving 
a system of linear equations, though special cases of it were known to Leibnitz 
75 years earlier. While it is elegant mathematically, it is clearly not computationally 
feasible, even when n is only moderately large, as was immediately realized by 
mathematicians of the time. Indeed, solving a system of linear equations AX = w 
by Cramer’s method, where A is a nonsingular n x n matrix over a field F’, requires 
znt a an? a sn? + an additions and sn4 + sn + $n? + én — | multiplications, 
which is considerably worse than the methods we have previously studied, for which
the number of arithmetic operations necessary grows as $n^3$, rather than as $n^4$.


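Example Consider the system of linear equations $AX = \begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix}$, where
$A = \begin{bmatrix} 1 & -1 & 1 \\ 1 & 2 & 0 \\ 1 & 0 & -1 \end{bmatrix}$. Then $|A| = -5$ and

$$|A_{(1)}| = \begin{vmatrix} 2 & -1 & 1 \\ 1 & 2 & 0 \\ 4 & 0 & -1 \end{vmatrix} = -13, \quad
|A_{(2)}| = \begin{vmatrix} 1 & 2 & 1 \\ 1 & 1 & 0 \\ 1 & 4 & -1 \end{vmatrix} = 4, \quad\text{and}\quad
|A_{(3)}| = \begin{vmatrix} 1 & -1 & 2 \\ 1 & 2 & 1 \\ 1 & 0 & 4 \end{vmatrix} = 7.$$

As a consequence, we see that the unique solution to the equation is $\frac{1}{5}\begin{bmatrix} 13 \\ -4 \\ -7 \end{bmatrix}$.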
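
The computation in this example can be reproduced with a short sketch of Cramer's method in exact rational arithmetic. The names `det` and `cramer` are illustrative; as noted above, the method is practical only for very small systems.

```python
from fractions import Fraction

def det(m):
    """Determinant by expansion along the first row; adequate for small n."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cramer(a, w):
    """Solve AX = w for nonsingular A: d_i = |A_(i)| / |A|."""
    d = det(a)
    if d == 0:
        raise ValueError("the matrix is singular")
    solution = []
    for i in range(len(a)):
        # A_(i): replace the i-th column of A by w.
        a_i = [row[:i] + [w[r]] + row[i + 1:] for r, row in enumerate(a)]
        solution.append(det(a_i) / d)
    return solution

A = [[1, -1, 1], [1, 2, 0], [1, 0, -1]]
v = cramer(A, [2, 1, 4])   # the system from the example above
```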


We note that if A = [aj;] € Mnxn(F) then the polynomial 


> sgn(w) Xx(1)X2(2) °° Xan) € F[X1,..-, Xn] 


TESy 


is flat and of degree n. This allows us to make an interesting use of Proposition 4.5. 




Proposition 11.16 Let $F$ be a field of characteristic other than 2, let $A =
[a_{ij}] \in M_{n\times n}(F)$ be an arbitrary matrix, and let $C = [c_{ij}] \in M_{n\times n}(F)$ be
a diagonal matrix with nonzero entries on the diagonal. Then there exists a
diagonal matrix $E = [e_{ij}] \in M_{n\times n}(F)$ with diagonal entries $\pm 1$ such that
$EC + A$ is nonsingular.


Proof Let $X_1, \ldots, X_n$ be indeterminates over $F$ and let $D = [d_{ij}] \in M_{n\times n}(F[X_1,
\ldots, X_n])$ be the diagonal matrix with $d_{ii} = X_i$ for $1 \le i \le n$. Then $|DC + A|$ is a
flat polynomial in $F[X_1, \ldots, X_n]$ of degree $n$, and so the result follows immediately
from Proposition 4.5.


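Example If $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ and if $e > 0$ then, by Proposition 11.16, it is
possible to “tweak” the diagonal of $A$ to obtain a nonsingular matrix $[a'_{ij}]$ where

$$a'_{ij} = \begin{cases} a_{ij} \pm e & \text{if } i = j, \\ a_{ij} & \text{otherwise.} \end{cases}$$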


The sum which appears in the definition of the determinant shows up in other
contexts related to matrix algebras. An associative algebra $(K, \bullet)$ over a field $F$ satisfies
the standard identity of degree $n$ if and only if
$\sum_{\pi \in S_n} \mathrm{sgn}(\pi)\, a_{\pi(1)} \bullet a_{\pi(2)} \bullet \cdots \bullet a_{\pi(n)} = 0$
for any list $a_1, \ldots, a_n$ of elements of $K$. Thus, for example, the
standard identity of degree 2 is $a_1 \bullet a_2 - a_2 \bullet a_1 = 0$. The algebra $K$ satisfies this
identity precisely when it is commutative. The Amitsur–Levitzki Theorem states that
for any field $F$ and any positive integer $n$, the $F$-algebra $M_{n\times n}(F)$ satisfies the
standard identity of degree $k$ for each $k \ge 2n$. There are several proofs of this
result, all beyond the scope of this book. Some of these are based on a generalization
of the Cayley–Hamilton Theorem, which we shall see in the following
chapter.



Yaakov Levitzki and his student Shimshon 
Amitsur were twentieth-century Israeli alge- 
braists. 


We end this chapter by showing how an important construction in analysis 
can be considered in terms of determinants of matrices over the R-algebra R[X]. 
Let $c_0, c_1, \ldots$ be real numbers and let us consider the analytic function $f: x \mapsto
\sum_{i=0}^{\infty} c_i x^i$, which converges for all $x$ in some subset $U$ of $\mathbb{R}$. We know that $U \neq \emptyset$
since surely $0 \in U$. Given positive integers $k$ and $n$, we want to find polynomials
$p(X), q(X) \in \mathbb{R}[X]$ of degrees at most $k$ and $n$, respectively, such that the function
$x \mapsto p(x)q(x)^{-1} - f(x)$ also converges for all $x \in U$ and is representable there by
a power series of the form $x \mapsto \sum_{i=1}^{\infty} d_i x^{k+n+i}$. If we find such $p$ and $q$, then the
function $x \mapsto p(x)q(x)^{-1}$ is called the Padé approximant to $f$ of type $k/n$. Padé
approximants are very important tools in differential equations and in approximation 
theory. Hermite made use of Padé approximants in his proof of the transcendence 
of e. 


Henri Padé was a nineteenth-century French en- 
gineer who developed these approximants in the 
course of his work. Interest in them intensified 
in the early twentieth century when the French 
mathematician Emile Borel made extensive use 
of them in his work on analysis. 


Example If $f: x \mapsto e^x = \sum_{i=0}^{\infty} \frac{1}{i!} x^i$ then the function $g_1: x \mapsto \frac{x^2 + 4x + 6}{6 - 2x}$ is a Padé
approximant to $f$ of type 2/1 and the function $g_2: x \mapsto \frac{x^2 + 6x + 12}{x^2 - 6x + 12}$ is a Padé
approximant to $f$ of type 2/2.


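If we are given an analytic $f$ as above, how do we calculate Padé approximants
to it? One way is by using determinants. First of all, define $c_{-i} = 0$ for
all positive integers $i$. Then, given positive integers $k$ and $n$, define the matrices
$P_{k/n}(X), Q_{k/n}(X) \in M_{(n+1)\times(n+1)}(\mathbb{R}[X])$ by setting:

$$P_{k/n}(X) = \begin{bmatrix}
c_{k-n+1} & c_{k-n+2} & \cdots & c_{k+1} \\
c_{k-n+2} & c_{k-n+3} & \cdots & c_{k+2} \\
\vdots & \vdots & & \vdots \\
c_k & c_{k+1} & \cdots & c_{k+n} \\
\sum_{i=0}^{k-n} c_i X^{n+i} & \sum_{i=0}^{k-n+1} c_i X^{n+i-1} & \cdots & \sum_{i=0}^{k} c_i X^{i}
\end{bmatrix}$$

and

$$Q_{k/n}(X) = \begin{bmatrix}
c_{k-n+1} & c_{k-n+2} & \cdots & c_{k+1} \\
c_{k-n+2} & c_{k-n+3} & \cdots & c_{k+2} \\
\vdots & \vdots & & \vdots \\
c_k & c_{k+1} & \cdots & c_{k+n} \\
X^n & X^{n-1} & \cdots & 1
\end{bmatrix}.$$

Then the polynomials $p(X) = |P_{k/n}(X)|$ and $q(X) = |Q_{k/n}(X)|$ are of the desired
size, and our approximant is given by $x \mapsto p(x)q(x)^{-1}$.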
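
The determinant construction can be carried out mechanically. The sketch below expands both determinants along their last rows, so that only determinants of constant matrices are needed; it represents polynomials as coefficient lists with the lowest degree first. The names `det`, `pade`, and `exp_coeff` are illustrative and are not taken from the text.

```python
from fractions import Fraction
from math import factorial

def det(m):
    """Determinant of a small square matrix by expansion along the first row."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def pade(c, k, n):
    """Coefficient lists of p = |P_{k/n}(X)| and q = |Q_{k/n}(X)|.

    c(i) must return the series coefficient c_i for each i >= 0.
    """
    def coef(i):
        return c(i) if i >= 0 else Fraction(0)     # convention c_{-i} = 0

    # The n constant rows shared by P_{k/n}(X) and Q_{k/n}(X).
    top = [[coef(k - n + 1 + r + s) for s in range(n + 1)] for r in range(n)]
    p = [Fraction(0)] * (k + 1)
    q = [Fraction(0)] * (n + 1)
    for s in range(n + 1):
        # Cofactor of the entry in the last row and column s (0-based).
        cofactor = (-1) ** (n + s) * det([row[:s] + row[s + 1:] for row in top])
        q[n - s] += cofactor                       # last row of Q holds X^(n-s)
        for i in range(k - n + s + 1):             # last row of P: sum of c_i X^(n-s+i)
            p[n - s + i] += cofactor * coef(i)
    return p, q

def exp_coeff(i):
    """Coefficient of x^i in the power series of e^x."""
    return Fraction(1, factorial(i))

p, q = pade(exp_coeff, 2, 1)
```

Since numerator and denominator may be multiplied by a common nonzero scalar without changing the quotient, the output should be compared with the example above only up to such a factor.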
Exercise 637
Calculate $\begin{vmatrix} \sin(a) & \cos(a) \\ \sin(b) & \cos(b) \end{vmatrix} \begin{vmatrix} \cos(a) & \sin(a) \\ \sin(b) & \cos(b) \end{vmatrix}$ for real numbers $a$ and $b$.
Exercise 638 
1 i i+i 
Calculate} —i 1 0 J;EC 
1-i 1 
Exercise 639 
a—6 0 0 —8 
5 a—4 0 12 
Calculate | 3 dr ne for anyaeR. 
1 
0 5 1 1 
Exercise 640 
Find the image of the function f from R to itself defined by 
1 O -t 
fitre]1l 1 -tl}. 
t 0 -l 
Exercise 641 
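For real numbers $a$, $b$, $c$, and $d$, show that

$$\begin{vmatrix}
a^2 & (a+1)^2 & (a+2)^2 & (a+3)^2 \\
b^2 & (b+1)^2 & (b+2)^2 & (b+3)^2 \\
c^2 & (c+1)^2 & (c+2)^2 & (c+3)^2 \\
d^2 & (d+1)^2 & (d+2)^2 & (d+3)^2
\end{vmatrix} = 0.$$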


Exercise 642
Let $n$ be a positive integer and let $c$ be a fixed real number. Calculate the determinant
of the matrix $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ defined by

$$a_{ij} = \begin{cases} c & \text{if } i < j, \\ i & \text{if } i = j, \\ 0 & \text{if } i > j. \end{cases}$$


Exercise 643
For $a, b \in \mathbb{R}$, calculate

$$\begin{vmatrix} a & b & a+b \\ b & a+b & a \\ a+b & a & b \end{vmatrix}.$$
Exercise 644
Let F = GF(2). Does there exist a matrix A € M2 x2(F) other than J satisfying 
the condition that |A| = |A7 A| = 1? 


Exercise 645 
If $n$ is a positive integer, we define the $n$th Hankel matrix $H_n \in M_{n\times n}(\mathbb{R})$ to be
the matrix $[a_{ij}]$ satisfying

$$a_{ij} = \begin{cases} 0 & \text{if } i + j - 1 > n, \\ i + j - 1 & \text{otherwise.} \end{cases}$$

Calculate $|H_n|$.


The nineteenth-century German mathematician Hermann Hankel 
was among the first to recognize and popularize the work of Grass- 
mann. 


Exercise 646 

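Let $p(X) = a_0 + a_1 X + a_2 X^2$ and $q(X) = b_0 + b_1 X + b_2 X^2$ be polynomials in
$\mathbb{C}[X]$. Show that there exists a complex number $c$ satisfying $p(c) = q(c) = 0$ if
and only if

$$\begin{vmatrix} a_0 & a_1 & a_2 & 0 \\ 0 & a_0 & a_1 & a_2 \\ b_0 & b_1 & b_2 & 0 \\ 0 & b_0 & b_1 & b_2 \end{vmatrix} = 0.$$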


Exercise 647 
Find the set of all pairs (a, b) of real numbers such that 


a+l1 3a = =6b+3a (b4+1 
2b b+1 2-—b 1 

a+2 0 1 a+3 
b-1 1 a+2 a+b 


=0. 


Exercise 648 
For $a, b, c \in \mathbb{R}$, show that
$\begin{vmatrix} 0 & (a-b)^2 & (a-c)^2 \\ (b-a)^2 & 0 & (b-c)^2 \\ (c-a)^2 & (c-b)^2 & 0 \end{vmatrix} \ge 0$.
Exercise 649
Let $n$ be a positive even integer and let $c, d \in \mathbb{Q}$. Let $A = [a_{ij}] \in M_{n\times n}(\mathbb{Q})$ be
the matrix defined by

$$a_{ij} = \begin{cases} c & \text{if } i = j, \\ d & \text{if } i + j = n + 1, \\ 0 & \text{otherwise.} \end{cases}$$

Calculate $|A|$.


Exercise 650 
Let n be a positive integer and let A = [ajj] € Mnxn(Q) be the tridiagonal matrix 
defined by 


$$a_{ij} = \begin{cases} 1 & \text{if } |i - j| \le 1, \\ 0 & \text{otherwise.} \end{cases}$$


Show that 


-1 ifn =3k, 
IAl=41 ifn=3k+1, 
0 ifn=3k+2 


for some nonnegative integer k. 


Exercise 651 
Find $a, b, c \in \mathbb{Z}$ for which $\begin{vmatrix} a+b & c & c \\ a & b+c & a \\ b & b & a+c \end{vmatrix}$ is divisible by 8.
b b atc 


Exercise 652 

Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{Q})$ be a nonsingular matrix satisfying
the condition that all of the entries of $A$ and of $A^{-1}$ are integers. Show that
$|A| = \pm 1$.


Exercise 653 
For elements $a$, $b$, and $c$ of a field $F$, calculate
$\begin{vmatrix} -2a & a+b & a+c \\ a+b & -2b & b+c \\ a+c & b+c & -2c \end{vmatrix}$.


Exercise 654 
We know that the integers 23028, 31882, 86469, 6327, and 61902 are all divisible
by 19. Show that

$$\begin{vmatrix} 2 & 3 & 0 & 2 & 8 \\ 3 & 1 & 8 & 8 & 2 \\ 8 & 6 & 4 & 6 & 9 \\ 0 & 6 & 3 & 2 & 7 \\ 6 & 1 & 9 & 0 & 2 \end{vmatrix}$$

is also divisible by 19.
Exercise 655
Let $q \in \mathbb{Q}$. Show that there are infinitely-many matrices in $M_{3\times 3}(\mathbb{Q})$ of the
form $\begin{bmatrix} 2 & 2 & 3 \\ 3q+2 & 4q+2 & 5q+3 \\ a & b & c \end{bmatrix}$, where $a < b < c$, the determinant of which
equals $q$.


Exercise 656 
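Let $n$ be a positive integer and let $F$ be a field. Let $A \in M_{n\times n}(F)$ be a nonsingular
matrix which can be written in block form $A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$, where
$A_{11} \in M_{k\times k}(F)$ for some integer $k < n$. Write $A^{-1}$ as $\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}$, where
$B_{11} \in M_{k\times k}(F)$. Show that $|A_{11}| = |A| \cdot |B_{22}|$.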


Exercise 657 


Find all real numbers a for which 


| 
Ke ONMNeR 
NR We 


-1 


Exercise 658 

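Let $a_1, a_2, \ldots$ be a sequence of real numbers. For each positive integer $n$, define
the $n$th continuant $c_n$ of the sequence to be the determinant of the tridiagonal
matrix $A_n = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ given by

$$a_{ij} = \begin{cases} a_i & \text{if } i = j, \\ -1 & \text{if } i = j - 1, \\ 1 & \text{if } i = j + 1, \\ 0 & \text{otherwise.} \end{cases}$$

Show that $c_n = a_n c_{n-1} + c_{n-2}$ for all $n > 2$.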


Exercise 659 
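Let $n > 1$ be an integer, let $d$ be a real number, and let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ be
the matrix defined as follows:

$$a_{ij} = \begin{cases} 0 & \text{if } i = j, \\ 1 & \text{if } i > 1 \text{ and } j = 1 \text{ or } i = 1 \text{ and } j > 1, \\ d & \text{otherwise.} \end{cases}$$

Show that $|A| = (-1)^{n-1}(n - 1)d^{n-2}$.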
Exercise 660
Let $b_1, \ldots, b_n$ be nonzero real numbers and let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ be the
matrix defined as follows:

$$a_{ij} = \begin{cases} 1 + b_i & \text{if } i = j, \\ 1 & \text{otherwise.} \end{cases}$$

Calculate $|A|$.


Exercise 661 
Let A = [aij] € M4x4(Q) be a matrix each entry of which is either —2 or 3. 
Show that |A| is an integer multiple of 125. 


Exercise 662 

Let a, b, c, and d be real numbers not all of which are equal to 0. Show that the 
a b Cc d 

matrix ee oe =e € M4,x4(R) is nonsingular. 
c —-d —a b 

d c —b -a 


Exercise 663 
Does there exist a rational number a satisfying the condition that the matrix 
loa 0 
a l 1 | €M3,3(Q) is nonsingular? 
-l a -l 


Exercise 664 
Find all matrices J 4 A € M2,.2(R) satisfying A =T. 


Exercise 665 
Find all triples (a, b, c) of real numbers satisfying the condition 


laa 
1 b Bl=(b—c)(c—a\(a—b\(at+tb+to). 
c Peas 


Exercise 666 

Let $n$ be a positive integer, let $A = [a_{ij}] \in M_{n\times n}(\mathbb{C})$ and let $B = [b_{ij}] \in
M_{n\times n}(\mathbb{C})$ be defined by $b_{ij} = \overline{a_{ij}}$ for each $1 \le i, j \le n$. Show that $|AB|$ is a
nonnegative real number.


Exercise 667 


1 log, a 


Calculate log. b f 


for given positive real numbers a and b. 


a 
Exercise 668


Let F be a field. Calculate . for anyae F. 


2a 3a 4 
4a> 3a 2a ss 1 


Exercise 669 

cos(a) _ sin(a) cos(a) sin(a) 
cos(2a) sin(2a) 2cos(2a) 2sin(2a) 
cos(3a) sin(3a) 3cos(3a) 3sin(3a) 
cos(4a) sin(4a) 4cos(4a) 4sin(4a) 


Calculate foraeR. 


Exercise 670 
Let $n$ be a positive integer and let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ be the matrix defined
by

$$a_{ij} = \begin{cases} 0 & \text{if } i = j, \\ 1 & \text{otherwise.} \end{cases}$$

Calculate $|A|$.


Exercise 671 


Let $A = \begin{bmatrix} 3 & -1 & 1 \\ 0 & 2 & 4 \\ 1 & -1 & 1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. Calculate $\mathrm{adj}(A)$.
Exercise 672 
1 0 
Let F = GF(2) andletA=|]0O 1 1 | €M3,x3(F). Calculate adj(A). 
1 1 1 


Exercise 673 
Let F be a field, let n be a positive integer, and let A, BE My xn(F). Is it nec- 
essarily true that adj(A B) = adj(A) adj(B)? 


Exercise 674 
Let F bea field, let n be a positive integer, and let A € M,,x,,(F). Is it necessarily 
true that adj(A’) = adj(A)?? 


Exercise 675 
Let F be a field, let n be a positive integer, and let the matrices A, B € Mynxn(F) 
be nonsingular. Show that adj(B~!AB) = B~! adj(A)B. 


Exercise 676 
Let F be a field, let n be a positive integer, and let A, B € My xn(F) be matrices 
satisfying B 4 O and AB = O. Show that |A| = 0. 
Exercise 677
Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 4 \\ 1 & 4 & 3 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. Use the adjoint of $A$ to calculate $A^{-1}$.


Exercise 678 

Let $F$ be a field, let $n$ be a positive integer, and let $A = [a_{ij}] \in M_{n\times n}(F)$. Let
$B = [b_{ij}] \in M_{n\times n}(F)$ be defined by $b_{ij} = (-1)^{i+j} a_{ij}$ for all $1 \le i, j \le n$. Show
that $|A| = |B|$.


Exercise 679

Let $F$ be a field, let $n$ be a positive integer, and let $A = [a_{ij}] \in M_{n\times n}(F)$. Let
$B = [b_{ij}] \in M_{n\times n}(F)$ be defined by $b_{ij} = (-1)^{i+j+1} a_{ij}$ for all $1 \le i, j \le n$. Show
that $(-1)^n |A| = |B|$.


Exercise 680 
Let $n$ be a positive integer and let $\pi \in S_n$. Let $A \in M_{n\times n}(\mathbb{Q})$ be the permutation
matrix defined by $\pi$. Calculate $|A|$.


Exercise 681 
Is the set of all permutation matrices in My xn(Q) closed under multiplication? 
Is the inverse of a permutation matrix a permutation matrix? 


Exercise 682 
Let A = [aij] € M33(R) be a matrix in which aj2 4 0 for all 1 <i < 3. Denote 
the minor of a;; for all 1 <i, j <n by A;;. Show that 


1 


a22 


1 


a32 


Az, Ag3 
A31 A32 


Ai Aj3 
A31 A33 


Ail Aj3 


1 
A| = — 
em Az; A22 


a12 


Exercise 683 
Let $F$ be a field, let $n$ be a positive integer, and let $A = [a_{ij}] \in M_{n\times n}(F)$ be
nonsingular. Show that $\mathrm{adj}(\mathrm{adj}(A)) = |A|^{n-2}A$.


Exercise 684 

Let a and b be real numbers and let n be an integer greater than 2. Let D = [dj;] € 
Mnxn(R) be the matrix defined by dj; = sin(ia + jb) for all 1 <i, j <n. Show 
that |D| = 0. 


Exercise 685 
Let F be a field and let a, b,c, d,e, f, g € F. Show that 
Exercise 686
Let $F$ be a field, let $n$ be a positive integer, and let $A = [a_{ij}] \in M_{n\times n}(F)$. Let
$B = [b_{ij}] \in M_{n\times n}(F)$ be the matrix defined by

$$b_{ij} = \begin{cases} a_{ij} + a_{i,j+1} & \text{if } j < n, \\ a_{in} & \text{otherwise.} \end{cases}$$

Show that $|B| = |A|$.


Exercise 687 
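Let $k$ and $n$ be integers greater than 1. Let $F$ be a field and let $A = [a_{ij}]$ be a
matrix in $M_{k\times n}(F)$, the upper row of which contains at least one nonzero entry.
For each $2 \le i \le k$ and each $2 \le j \le n$, let $d_{ij} = \begin{vmatrix} a_{11} & a_{1j} \\ a_{i1} & a_{ij} \end{vmatrix}$. Show that the rank
of the matrix $D = \begin{bmatrix} d_{22} & \cdots & d_{2n} \\ \vdots & & \vdots \\ d_{k2} & \cdots & d_{kn} \end{bmatrix} \in M_{(k-1)\times(n-1)}(F)$ is $r - 1$, where $r$ is
the rank of $A$.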
Exercise 688 


Let $F$ be a field, let $a \neq b$ be elements of $F$, and let $A, B \in M_{2\times 2}(F)$ be matrices
satisfying the condition that $|A + hB| \in \{a, b\}$ for $h = 1, 2, 3, 4, 5$. Show that
$|A + 9B| \in \{a, b\}$.


Exercise 689 


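Let $F$ be a field and let $a, b, c \in F$. Make use of the matrix $\begin{bmatrix} b & c & 0 \\ a & 0 & c \\ 0 & a & b \end{bmatrix}$ in order
to calculate the determinant of the matrix
$\begin{bmatrix} b^2 + c^2 & ab & ac \\ ab & a^2 + c^2 & bc \\ ac & bc & a^2 + b^2 \end{bmatrix}$.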


Exercise 690 

Let A € Myyn(C) be a nonsingular matrix, which we will write in the form 
B+iC, where B,C € My x,(R). Show that there is a real number d such that 
the matrix B +dC € M).x»(R) is nonsingular. 


Exercise 691 

Let F be a field and let n be a positive integer. Let A € My xn(F) be a matrix 
having the property that the sum of all even-numbered columns (considered as 
vectors in F”) of A equals the sum of all odd-numbered columns of A. What 
is |A|? 
Exercise 692

Let $V = \mathbb{R}^2$ and let $f: V^3 \to \mathbb{R}$ be the function defined as follows: if
$v_i = \begin{bmatrix} a_i \\ b_i \end{bmatrix}$ for $i = 1, 2, 3$, then
$f: \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \mapsto \begin{vmatrix} a_1 & b_1 & 1 \\ a_2 & b_2 & 1 \\ a_3 & b_3 & 1 \end{vmatrix}$.
Show that $f(v_1, v_2, v_3) = f(v_4, v_2, v_3) + f(v_1, v_4, v_3) + f(v_1, v_2, v_4)$ for all $v_1, v_2, v_3, v_4 \in V$.


Exercise 693 

Let F be a field and let n be a positive integer. Let D = [djj] € Mnxn(F) be 
the matrix defined by dj; = 1 for all 1 <i, j <n. Show that for any matrix 
A € Mnxn(F) precisely one of the following conditions holds: (1) There is a 
unique scalar a € F such that A + aD is singular; (2) A + aD is singular for all 
scalars a € F; (3) A+ aD is nonsingular for all scalars a € F. 


Exercise 694 
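Let $A, B, C, D \in M_{2\times 2}(\mathbb{R})$ and let $M$ be the matrix $\begin{bmatrix} A & B \\ C & D \end{bmatrix} \in M_{4\times 4}(\mathbb{R})$. If
all of the “formal determinants” $AD - BC$, $AD - CB$, $DA - BC$, and $DA - CB$
are nonsingular, is $M$ necessarily nonsingular?


Exercise 695

Let $A, B, C, D \in M_{2\times 2}(\mathbb{R})$ and let $M$ be the matrix $\begin{bmatrix} A & B \\ C & D \end{bmatrix} \in M_{4\times 4}(\mathbb{R})$. If
$M$ is a nonsingular matrix, is at least one of the “formal determinants” $AD - BC$,
$AD - CB$, $DA - BC$, and $DA - CB$ also nonsingular?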


Exercise 696 

Let $n > 1$ be an integer and let $A = [a_{ij}] \in M_{n\times n}(\mathbb{Q})$ be a matrix satisfying the
condition that each $a_{ij}$ is either equal to 1 or to $-1$. Show that $|A|$ is an integer
multiple of $2^{n-1}$.


Exercise 697 
If a,b,c, d,e, f are nonzero elements of a field F’, show that 


0 @ BP ¢ 0 ad be cf 
a’ 0 f* e?| lad 0 cf be 
eo FO a |" be ef O ad) 
ee 2 < 0 cf be ad 0O 


Exercise 698 

Let n be a positive integer and let cj,...,c, be distinct real numbers tran- 
scendental over Q. For 1 < h <n, let pp(X) = ys aj X' € Q[X] be a 
polynomial of degree h — 1. Let A = [p;(cj)] € Mnxn(R). Show that |A| = 
(a0--+@n—1) |]; <j (ej — ci). 
Exercise 699

Let n be a positive integer and let cj,...,c, be distinct real numbers tran- 
scendental over Q. For each 1 <i, j <n, set djj = cl = ce, and let A = 
[dij] € Mnxn(R). Show that |A| equals (cy---cn)~" [i<j — oc) — 
ceiej) I Vjai (C7 — 1). 


Exercise 700 
Let a, b, c,d be elements of a field F’. Solve the equation 


No ea 
a no & 
Fa an 
a Fa Re 


Exercise 701 
Let F bea field. Does there exist a matrix A in M3,.3(F) satisfying the condition 
that the rank of adj(A) equals 2? 


Exercise 702 
Let $a$, $b$, and $c$ be nonzero real numbers. Under which conditions does the equation
$\begin{vmatrix} 0 & a - X & b - X \\ -a - X & 0 & c - X \\ -b - X & -c - X & 0 \end{vmatrix} = 0$ have more than one solution?


Exercise 703 
Use determinants to show that there is no matrix $A \in M_{4\times 4}(\mathbb{Q})$ satisfying the
condition that $A^4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$.


Exercise 704 
Let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ be a matrix satisfying the condition $|a_{ii}| > \sum_{j \neq i} |a_{ij}|$
for all $1 \le i \le n$. Such matrices are called strictly diagonally dominant. Show
that $|A| \neq 0$.


Exercise 705 
Let F be a field and let a, b,c € F. Is it true that 


a be O —a b c 0 
ba Oc|]_| 6b -a O ¢|, 
c 0 a b Cc 0 -a b\" 
0 c ba 0 c b -a 




Exercise 706 
Let F be a field and let n > 2 be an integer. Give an example of a matrix 
A € Mnhxn(F) all of the entries in which are nonzero, satisfying adj(A) = O. 


Exercise 707 
Let $F$ be a field and let $n > 2$ be an integer. Show that $|\mathrm{adj}(A)| = |A|^{n-1}$ for all
$A \in M_{n\times n}(F)$.


Exercise 708 
Is the function adj: M2x2(R) > M2 x2(R) epic? 


Exercise 709 


For real numbers $s$ and $t$, let $A(s, t) = \begin{bmatrix} s & 0 & t \\ 1 & 1 & 1 \\ t & 0 & 1 \end{bmatrix}$ and let $B(s, t) = \mathrm{adj}(A(s, t))$.
Find the set of all real numbers $s$ satisfying the condition that $|A(s, t)| \neq |B(s, t)|$
for all $t \in \mathbb{R}$.


Exercise 710 

Let n be a positive integer and for all 1 < j <n, let m; be a positive inte- 
ger. Define the matrix A = [ajj] € Mnxn(Q) by setting ajj = Ge) for all 
1 <i, j <n. Calculate |A|. 


Exercise 711 
Let $a$ and $b$ be distinct elements of a field $F$ and let $n$ be a positive integer. Let
$A(n) = [a_{ij}] \in M_{n\times n}(F)$ be the matrix defined by

$$a_{ij} = \begin{cases} a & \text{if } i = j, \\ b & \text{otherwise.} \end{cases}$$

Use induction on $n$ to prove that $|A(n)| = [a + (n - 1)b](a - b)^{n-1}$.


Exercise 712 
Let n be a positive integer and pick integers | < h,k <n. Let f, g € R® be the 
functions defined by 


[Encl ife #0, 
fro {f ifc=0 


and g:ct> |Enx-c|. Are these functions continuous? 


Exercise 713 

Let n be a positive integer and let A € My x»(Q) be a nonsingular matrix the 
entries of which are integers and the determinant of which is +1. Show that all 
of the entries of A~! are integers. 
Exercise 714
Let $F$ be a field and let $A \in M_{2\times 2}(F)$. Show that the matrix $A^2 + |A|I$ belongs
to the subspace of $M_{2\times 2}(F)$ generated by $\{A\}$.


Exercise 715 

Let $A = [a_{ij}] \in M_{3\times 3}(\mathbb{Q})$ be a matrix all of the entries of which are nonnegative
one-digit integers. Let $d$ be a positive integer dividing the three-digit integers
$a_{11}a_{12}a_{13}$, $a_{21}a_{22}a_{23}$, and $a_{31}a_{32}a_{33}$. Show that $d$ divides $|A|$.


Exercise 716 

Let $n$ be a positive integer and let $F$ be a field. Let $A \in M_{n\times n}(F)$ be a matrix
satisfying the condition that $|A + B| = |A|$ for all $B \in M_{n\times n}(F)$. Show that
$A = O$.


Exercise 717 

Let $n$ be an odd positive integer and let $A \in M_{n\times n}(\mathbb{R})$. Show that there exists a
diagonal matrix $B$ the diagonal entries of which are $\pm 1$ such that $A + B$ is nonsingular.


Exercise 718 

Let n > 1 be an integer and let F be a field. Show that there exist subspaces W 
and Y of Mnxn(F) satisfying Mnxn(F) = W @ Y such that the restrictions of 
the determinant function 6, to W and to Y are linear transformations. 


Exercise 719 

Let $n > 1$ be an integer and let $B$ be the set of all of the nonsingular matrices in
$M_{n\times n}(\mathbb{R})$ all of the entries of which are either 1 or 0. Show that in every matrix
in $B$ there are at least $n - 1$ entries which are equal to 0 and that there exists a
matrix in $B$ in which there are precisely $n - 1$ entries equal to 0.


Exercise 720 
Let A be a matrix formed by permuting the rows or columns of an Hadamard 
matrix. Is A necessarily an Hadamard matrix? 


Exercise 721 

Let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $\{v_1, \ldots, v_n\}$
be a given basis for $V$. Let $U$ be the subset of $V$ consisting of all vectors of
the form $y_a = \sum_{i=1}^{n} a^{i-1} v_i$, for $0 \neq a \in F$. Show that any subset of $U$ having $n$
elements is a basis for $V$.


Exercise 722 
Let F be a field and let n be an even positive integer. Let A € M,x,(F) bea 


matrix which can be written in block form as [A;;], where Ajj = : q| if 


—ci O 
0 0 : 
otherwise. Calculate the Pfaffian of A. 


j= jand Ay =[5 0 
Exercise 723
Let $F$ be a field and let $n$ be an even positive integer. Let $c \in F$ and let
$A \in M_{n\times n}(F)$ be a skew-symmetric matrix having Pfaffian $d$. What is the Pfaffian
of $cA$?


Exercise 724 
Letc= 5(1 + i4/3). Find the set of all real numbers a such that 


a 1 1 
1 c cleR 
1 cece 


Exercise 725 


For elements a, b,c, d of a field F, calculate the value of 


Yo Fs 
a Aa & 
Sera ana 
gro & 


Exercise 726 

Let F be a field and let n be a positive integer. Set V = My xn(F) and let 
A,B eV satisfy the condition that |AB| = 1. Then the functiona: Che ACB 
is an endomorphism of V satisfying |C| = |a(C)| for all C € V. Find an endo- 
morphism of V satisfying the same condition, which is not of this form. 


Exercise 727 

Let F bea field and let n be a positive integer. For A, B, C, D€ Myxn(F), show 
A B —C -D 

C D A By 


that 


Exercise 728 
Let $F$ be a field and let $A, B, C, D \in M_{n\times n}(F)$ for some positive integer $n$. If
$CD = DC$ and $|D| \neq 0$, show that $\begin{vmatrix} A & B \\ C & D \end{vmatrix} = |AD - BC|$.


Exercise 729 
If $F$ is a field and $A = [a_{ij}] \in M_{n\times n}(F)$ for some positive integer $n$, then we
define the permanent of $A$ to be

$$\sum_{\pi \in S_n} a_{\pi(1),1}\, a_{\pi(2),2} \cdots a_{\pi(n),n}.$$

(i) Show that the permanent of $A$ is the coefficient of $X_1 \cdots X_n$ in the polynomial
$\prod_{i=1}^{n} (a_{i1}X_1 + \cdots + a_{in}X_n) \in F[X_1, \ldots, X_n]$.
(ii) If A is a permutation matrix, what is its permanent? 
Exercise 730
Does there exist a matrix A € M5,.5(Q) the permanent of which equals 120? 


Exercise 731 
Let F bea field. For any matrix A = [a;;] € M2 2(F), let U(A) be the set of all 
matrices B € M2 x2(F) satisfying |A + B| = |A| +|B|. Is U(A) a subspace of 
M2x2(F)? 


Exercise 732 
Let F be a field and let A € M2,2(F). Find a necessary and sufficient condition 
for |J + A] = 1+ |A| to hold. 


Exercise 733 
Let $F$ be a field and let $n$ be a positive integer. If $A, B, C, D \in M_{n\times n}(F)$ with
$D$ nonsingular, show that $\begin{vmatrix} A & B \\ C & D \end{vmatrix} = |AD - BD^{-1}CD|$.


Exercise 734 
Let a, b,c € Z. Find a positive integer n such that 


2c a+b+c at+b+c 
atb+c na atb+c 
atb+c at+b+c 2b 


is divisible by abc. 
Exercise 735 


Let n be a positive integer and let A € Myxn(Q) be a matrix all entries of which 
are integers and satisfying |A| = +1. Show that all entries of A~! are integers. 


Exercise 736 
Find the Padé approximant to x > e* of type 2/4. 


Exercise 737 
Let a, b, and c be elements of a field F’. Find an element x of F' such that 


1 a a 1 -b-c be 
1 x b*/=J]1 -c-—a ca 
lc ¢e 1 -a—b ab 


12 Eigenvalues and Eigenvectors


One of the central problems in linear algebra is this: given a vector space $V$ finitely
generated over a field $F$, and given an endomorphism $\alpha$ of $V$, is there a way to
select a basis $B$ of $V$ so that the matrix $\Phi_{BB}(\alpha)$ is as nice as possible? In this
chapter, we will begin by defining some basic notions which will help us address
this problem.

Let $V$ be a vector space over a field $F$ and let $\alpha \in \mathrm{End}(V)$. A scalar $c \in F$ is
an eigenvalue of $\alpha$ if and only if there exists a vector $v \neq 0_V$ satisfying $\alpha(v) = cv$.
Such a vector is called an eigenvector$^1$ of $\alpha$ associated with the eigenvalue $c$. Thus
we see that a nonzero vector $v \in V$ is an eigenvector of $\alpha$ if and only if the subspace
$Fv$ of $V$ is invariant under $\alpha$. Every eigenvector of $\alpha$ is associated with a unique
eigenvalue of $\alpha$ but any eigenvalue has, as a rule, many eigenvectors associated
with it. The set of all eigenvalues of $\alpha$ is called the spectrum of $\alpha$ and is denoted
by $\mathrm{spec}(\alpha)$. Thus, $c \in \mathrm{spec}(\alpha)$ if and only if the endomorphism $c\sigma_1 - \alpha$ of $V$ is not
monic.


Example If $V$ is a vector space over $\mathbb{R}$ and if $\alpha \in \mathrm{End}(V)$ satisfies $\alpha^2 = -\sigma_1$, then
$\mathrm{spec}(\alpha) = \emptyset$. To see this, note that if $v$ is an eigenvector corresponding to an eigenvalue
$c$ then $-v = \alpha^2(v) = c^2 v$ and so $(c^2 + 1)v = 0_V$, implying that $c^2 = -1$,
which is impossible for a real number $c$. In particular, if $\alpha \in \mathrm{End}(\mathbb{R}^2)$ is defined by


Qa: | > Hw then spec(a) = ©. 


$^1$The terms “eigenvalue” and “eigenvector” are due to Hilbert. Eigenvalues and eigenvectors are
sometimes called characteristic values and characteristic vectors, respectively, based on termi- 
nology used by Cauchy. Sylvester coined the term “latent values” since, as he put it, such scalars 
are “latent in a somewhat similar sense as vapor may be said to be latent in water or smoke in a 
tobacco-leaf”. 
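Example Let $\alpha \in \mathrm{End}(\mathbb{R}^2)$ be defined by $\alpha: \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ a \end{bmatrix}$. Then $c \in \mathrm{spec}(\alpha)$ if and
only if there exists a nonzero vector $\begin{bmatrix} a \\ b \end{bmatrix}$ satisfying $\begin{bmatrix} b \\ a \end{bmatrix} = c\begin{bmatrix} a \\ b \end{bmatrix}$. Therefore, we see that
$\mathrm{spec}(\alpha) = \{-1, 1\}$, where $\begin{bmatrix} a \\ -a \end{bmatrix}$ is an eigenvector of $\alpha$ associated with $-1$ and
$\begin{bmatrix} a \\ a \end{bmatrix}$ is an eigenvector of $\alpha$ associated with 1, for any $0 \neq a \in \mathbb{R}$.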


Example Let $V$ be the vector space of all infinitely-differentiable functions from $\mathbb{R}$
to itself and let $\delta$ be the endomorphism of $V$ which assigns to each such function
its derivative. Then a function $f$, which is not the 0-function, is an eigenvector
of $\delta$ if and only if there exists a scalar $c \in \mathbb{R}$ such that $\delta(f) = cf$. For
any real number $c$, there is indeed such a function in $V$, namely the function
$x \mapsto e^{cx}$. Thus $\mathrm{spec}(\delta) = \mathbb{R}$. The set of all eigenvectors of $\delta$ associated with $c$ is
$\{ae^{cx} \mid a \neq 0\}$. This fact has important applications in the theory of differential
equations.


The first use of eigenvalues to 
study differential equations is 
due to the French mathemati- 
cian Jean d’Alembert, one 
of the foremost researchers 
of the eighteenth century. Im- 
portant solutions of eigen- 
value problems for second- 
order differential equations were obtained in the nineteenth century by Swiss mathematician 
Charles-Francois Sturm and French mathematician Joseph Liouville. 


Let $\alpha$ be an endomorphism of a vector space $V$ over a field $F$ having an eigenvalue
$c$. If $\beta \in \mathrm{Aut}(V)$ then $c$ is also an eigenvalue of $\beta\alpha\beta^{-1}$. Indeed, if $v$ is an
eigenvector of $\alpha$ associated with $c$ then $\beta\alpha\beta^{-1}(\beta(v)) = \beta\alpha(v) = \beta(cv) = c\beta(v)$
and $\beta(v) \neq 0_V$ since $\beta$ is an automorphism. Therefore, $\beta(v)$ is an eigenvector of
$\beta\alpha\beta^{-1}$ associated with $c$.

Similarly, let $p(X) = \sum_{i=0}^{n} b_i X^i \in F[X]$. If $v \in V$ is an eigenvector of $\alpha$ associated
with an eigenvalue $c$, then $v$ is also an eigenvector of $p(\alpha) \in \mathrm{End}(V)$ associated
with the eigenvalue $p(c)$, since $p(\alpha)v = \sum_{i=0}^{n} b_i \alpha^i(v) = \sum_{i=0}^{n} b_i c^i v = p(c)v$. In
particular, we see that, for any positive integer $n$, the vector $v$ is an eigenvector of
$\alpha^n$ associated with the eigenvalue $c^n$.

Let $V$ be a vector space over a field $F$ and let $\alpha$ be an endomorphism of $V$.
A vector $v \in V$ is a fixed point of $\alpha$ if and only if $\alpha(v) = v$. It is clear that $0_V$ is a
fixed point of every endomorphism of $V$ and a nonzero vector $v$ is a fixed point of
$\alpha$ if and only if $1 \in \mathrm{spec}(\alpha)$ and $v$ is an eigenvector of $\alpha$ associated with 1.
Proposition 12.1 Let $V$ be a vector space over a field $F$ and let $\alpha$ be an
endomorphism of $V$ having an eigenvalue $c$. The subset $W$ composed of $0_V$
and all eigenvectors of $\alpha$ associated with $c$ is a subspace of $V$.


Proof If $w, w' \in W$ and $a \in F$ then $\alpha(w + w') = \alpha(w) + \alpha(w') = cw + cw' =
c(w + w')$ and $\alpha(aw) = a\alpha(w) = a(cw) = c(aw)$ and so $w + w', aw \in W$, proving
that $W$ is a subspace of $V$.


Let $V$ be a vector space over a field $F$ and let $\alpha$ be an endomorphism of $V$
having an eigenvalue $c$. The subset $W$ composed of $0_V$ and all eigenvectors of $\alpha$
associated with $c$, which we know by Proposition 12.1 is a subspace of $V$, is called
the eigenspace of $\alpha$ associated with $c$. In particular, if 1 is an eigenvalue of $\alpha$ then
the fixed space of $\alpha$ is the eigenspace associated with 1. If $1 \notin \mathrm{spec}(\alpha)$ then the fixed
space of $\alpha$ is taken to be $\{0_V\}$.


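Example Define $\alpha \in \mathrm{End}(\mathbb{R}^3)$ by $\alpha: \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a \\ 0 \\ c \end{bmatrix}$. Then $1 \in \mathrm{spec}(\alpha)$ and the
eigenspace of $\alpha$ associated with 1 (namely the fixed space of $\alpha$) is
$\mathbb{R}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}$.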

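Example Small errors in recording data may lead to considerable errors in the calculation
of eigenspaces, even if the eigenvalues are calculated correctly. For example,
let $a, b, c, e \in \mathbb{R}$ and let $\alpha$ and $\beta$ be the endomorphisms of $\mathbb{R}^3$ represented with
respect to some fixed basis by the matrices
$\begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}$ and $\begin{bmatrix} a & e & e \\ 0 & b & e \\ 0 & 0 & c \end{bmatrix}$, respectively.
Then $\mathrm{spec}(\alpha) = \mathrm{spec}(\beta) = \{a, b, c\}$. The eigenspaces of $\alpha$ associated with $a$,
$b$, $c$ are $\mathbb{R}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbb{R}\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, and $\mathbb{R}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$. The eigenspaces of $\beta$ associated with $a$, $b$,
$c$ are $\mathbb{R}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbb{R}\begin{bmatrix} e \\ b - a \\ 0 \end{bmatrix}$, and $\mathbb{R}\begin{bmatrix} e(e + c - b) \\ e(c - a) \\ (c - a)(c - b) \end{bmatrix}$.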
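
The eigenvectors claimed for $\beta$ can be checked on a concrete instance. The sketch below assumes the sample values $a = 1$, $b = 2$, $c = 3$, and $e = 1/100$, which are chosen here only for illustration, and verifies in exact arithmetic that each listed vector is mapped to the corresponding scalar multiple of itself; the helper name `apply` is likewise illustrative.

```python
from fractions import Fraction

def apply(m, v):
    """Product of the matrix m with the column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

# Sample values, assumed for illustration only.
a, b, c = Fraction(1), Fraction(2), Fraction(3)
e = Fraction(1, 100)

beta = [[a, e, e],
        [0, b, e],
        [0, 0, c]]

# Eigenvalue -> spanning vector of the eigenspace, as stated in the example.
eigenpairs = {
    a: [1, 0, 0],
    b: [e, b - a, 0],
    c: [e * (e + c - b), e * (c - a), (c - a) * (c - b)],
}

# Each entry is True when beta maps the vector to (eigenvalue) * (vector).
checks = [apply(beta, v) == [lam * x for x in v] for lam, v in eigenpairs.items()]
```

Even for this small value of $e$, the eigenvectors associated with $b$ and $c$ no longer lie along the coordinate axes.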


Example Let $V = C(0, 1)$ and let $\alpha$ be the endomorphism of $V$ defined by
$\alpha(f): x \mapsto \int_0^1 \cos(\pi[x - t]) f(t)\,dt$ for all $f \in V$. To find the eigenvalues of $\alpha$,
recall the trigonometric identity

$$\cos(\pi[x - t]) = \cos(\pi x)\cos(\pi t) + \sin(\pi x)\sin(\pi t).$$

Using this identity, we see that if $f \in V$ then

$$\alpha(f): x \mapsto \left[\int_0^1 \cos(\pi t) f(t)\,dt\right]\cos(\pi x) + \left[\int_0^1 \sin(\pi t) f(t)\,dt\right]\sin(\pi x)$$

and so the image of $\alpha$ is contained in the subspace $W = \mathbb{R}\{g_1, g_2\}$ of $V$, where
$g_1: x \mapsto \cos(\pi x)$ and $g_2: x \mapsto \sin(\pi x)$. It is easy to see that $\alpha(g_1) = \frac{1}{2}g_1$ and
$\alpha(g_2) = \frac{1}{2}g_2$, so both of these functions are eigenvectors of $\alpha$ associated with
the eigenvalue $\frac{1}{2}$. Moreover, $\{g_1, g_2\}$ is linearly independent. Thus we see that
$\mathrm{spec}(\alpha) = \{\frac{1}{2}\}$ and the eigenspace associated with this sole eigenvalue is $W$.


Proposition 12.2 Let $V$ be a vector space finitely generated over a field $F$
and let $\alpha$ be an endomorphism of $V$. Then the following conditions on a scalar
$c$ are equivalent:

(1) $c$ is an eigenvalue of $\alpha$;

(2) $c\sigma_1 - \alpha \notin \mathrm{Aut}(V)$;

(3) If $A = \Phi_{BB}(\alpha)$ for some basis $B$ of $V$, then $|cI - A| = 0$.


Proof (1) $\Leftrightarrow$ (2): Condition (1) is satisfied if and only if there exists a nonzero vector
$v \in V$ satisfying $\alpha(v) = cv$, i.e., if and only if $(c\sigma_1 - \alpha)(v) = 0_V$. This is true if and
only if $\ker(c\sigma_1 - \alpha) \neq \{0_V\}$. Since $V$ is finitely generated, by Proposition 7.3, we
know that this is true if and only if condition (2) holds.

(2) $\Leftrightarrow$ (3): This is a direct consequence of the fact that a matrix is nonsingular if
and only if its determinant is nonzero.


From Proposition 12.2, we see how to define eigenvalues of square matrices over
a field: if $F$ is a field and $n$ is a positive integer, then $c \in F$ is an eigenvalue of a
matrix $A \in M_{n \times n}(F)$ if and only if $|cI - A| = 0$, namely if and only if the matrix
$cI - A$ is singular. The set of all eigenvalues of $A$ will be denoted by $\mathrm{spec}(A)$. In
particular, we observe that a matrix $A$ is nonsingular if and only if $0 \notin \mathrm{spec}(A)$.

A vector $\begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \neq v \in F^n$ is an eigenvector of $A$ associated with the eigenvalue $c$ if
and only if $Av = cv$. The subset of $F^n$ consisting of $\begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ and all eigenvectors of
$A$ associated with $c$ is a subspace of $F^n$ called the eigenspace associated with $c$. In
the case that $F$ equals $\mathbb{R}$ or $\mathbb{C}$, the number $\rho(A) = \max\{|c| : c \in \mathrm{spec}(A)\}$ is called
the spectral radius of the matrix $A$, and plays a very important part in the numerical
analysis of matrices. Note that if $F = \mathbb{C}$, then $\rho(A)$ is just the radius of the smallest
circle in the complex plane, centered at the origin, containing $\mathrm{spec}(A)$. Moreover,
$\mathrm{spec}(A)$ consists precisely of the poles of the function $z \mapsto |zI - A|^{-1}$, and this
observation allows the use of powerful techniques of complex analysis in the study
of the spectra of complex matrices.

Calculating the spectra of matrices is a critical tool in many applications of math- 
ematics. Thus, for example, in statistics one learns that finding the spectrum of co- 
variance matrices is an integral part of several data analysis techniques. 


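Example It is not necessarily true that $\rho(AB) = \rho(A)\rho(B)$ for square matrices $A$
and $B$. For example, there are matrices in $M_{2 \times 2}(\mathbb{R})$, such as $A = \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix}$ and
$B = \begin{bmatrix} 0 & 0 \\ 2 & 0 \end{bmatrix}$, for which $\rho(A) = 0 = \rho(B)$, whereas $\rho(AB) = 4$.

Given a matrix $A \in M_{n \times n}(F)$, we note that $|cI - A| = |(cI - A)^T| = |cI - A^T|$
and so $\mathrm{spec}(A) = \mathrm{spec}(A^T)$. However, for each such common eigenvalue, the associated
eigenvectors may be different.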


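Example Let $A = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$. Then $\mathrm{spec}(A) = \{-1, 1, 2\}$ and so
this is also $\mathrm{spec}(A^T)$.
(1) The eigenspace of $A$ associated with $-1$ is $\mathbb{R}\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$ and the eigenspace of $A^T$
associated with $-1$ is $\mathbb{R}\begin{bmatrix} 1 \\ 2 \\ -7 \end{bmatrix}$;
(2) The eigenspace of $A$ associated with 1 is $\mathbb{R}\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}$ and the eigenspace of $A^T$
associated with 1 is $\mathbb{R}\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$;
(3) The eigenspace of $A$ associated with 2 is $\mathbb{R}\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}$ and the eigenspace of $A^T$
associated with 2 is $\mathbb{R}\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}$.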
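The eigenvectors in this example can be confirmed by direct multiplication. The sketch below is a check only, not part of the text; the helper names are hypothetical, and the first entry of the last eigenvector of $A^T$ is taken to be $-1$, which is an assumption forced by the equation $A^T w = 2w$.

```python
def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a column vector (list)."""
    return [sum(r[j] * v[j] for j in range(len(v))) for r in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def is_eigenpair(m, lam, v):
    """True if v is nonzero and m v = lam v."""
    return any(v) and mat_vec(m, v) == [lam * x for x in v]

A = [[1, 1, -2],
     [-1, 2, 1],
     [0, 1, -1]]
At = transpose(A)

# (eigenvalue, eigenvector of A, eigenvector of A^T)
triples = [
    (-1, [1, 0, 1], [1, 2, -7]),
    (1, [3, 2, 1], [-1, 0, 1]),
    (2, [1, 3, 1], [-1, 1, 1]),
]

results = [(lam, is_eigenpair(A, lam, v), is_eigenpair(At, lam, w))
           for lam, v, w in triples]
print(results)

# An eigenvector of A need not be an eigenvector of A^T for the same eigenvalue.
shared = is_eigenpair(At, -1, [1, 0, 1])
print(shared)
```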


It is interesting to note the following. Let $F$ be a field and let $n$ be a positive
integer. If $v, w \in F^n$, then $v \wedge w = vw^T \in M_{n \times n}(F)$ and $v \odot w = v^T w \in F$. Direct
calculation then yields $(v \wedge w)v = (v \odot w)v$, showing that $v$ is an eigenvector of
$v \wedge w$ associated with the eigenvalue $v \odot w$.


Example Let $n$ be a positive integer and let $A = [a_{ij}]$ be an $n \times n$ Markov matrix,
which we will consider as an element of $M_{n \times n}(\mathbb{C})$. We claim that $\rho(A) \le 1$. Indeed,
let $c \in \mathrm{spec}(A)$ and let $v = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \in \mathbb{C}^n$ be an eigenvector associated with $c$. Let
$1 \le h \le n$ satisfy the condition that $|b_i| \le |b_h|$ for all $1 \le i \le n$. Then $Av = cv$
implies, in particular, that $\sum_{j=1}^n a_{hj} b_j = c b_h$ and so
\[ |c| \cdot |b_h| = |c b_h| = \left| \sum_{j=1}^n a_{hj} b_j \right| \le \sum_{j=1}^n a_{hj} |b_j| \le \left( \sum_{j=1}^n a_{hj} \right) |b_h| = |b_h|. \]
Since $v$ is nonzero, we have $|b_h| > 0$. Hence $|c| \le 1$, as claimed.


Example Let $n$ be a positive integer and let $A \in M_{n \times n}(\mathbb{R})$ be a skew-symmetric
matrix. We claim that $\mathrm{spec}(A) \subseteq \{0\}$, with equality when $n$ is odd. Indeed, let
$c \in \mathrm{spec}(A)$ and let $v \in \mathbb{R}^n$ be an eigenvector of $A$ associated with $c$. Then
$-A^T v = Av = cv$ and so $-A^T(Av) = -A^T(cv) = c(-A^T v) = c^2 v$. Therefore,
$-(Av \odot Av) = -v^T A^T A v = c^2 v^T v = c^2 (v \odot v)$. But if $y = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$ is any vector
in $\mathbb{R}^n$, then $y \odot y = \sum_{i=1}^n b_i^2 \ge 0$, with equality if and only if $y = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. Since
$v$ is nonzero, we conclude that we must have $c^2 = 0$ and so $c = 0$. Therefore,
$\mathrm{spec}(A) \subseteq \{0\}$. If $n$ is odd then, by the remark after Proposition 11.7, we know
that $A$ is singular and so $0 \in \mathrm{spec}(A)$, establishing equality.


Example Let $n$ be a positive integer and let $A \in M_{n \times n}(\mathbb{C})$. If $c$ is a nonzero eigenvalue
of $A$ and if $v \in \mathbb{C}^n$ is an eigenvector associated with $c$ then, by Proposition 11.13,
we know that $|A|v = \mathrm{adj}(A)Av = c[\mathrm{adj}(A)]v$ and so $[\mathrm{adj}(A)]v = c^{-1}|A|v$.
Thus $v$ is also an eigenvector of $\mathrm{adj}(A)$ associated with the eigenvalue $c^{-1}|A|$.


If $F$ is a field, if $n$ is a positive integer, and if $A \in M_{n \times n}(F)$ is a matrix
having eigenvalue $c$, then $|cI - A| = 0$ and so, by Proposition 11.13, we have
$(cI - A)\,\mathrm{adj}(cI - A) = O$, whence $A[\mathrm{adj}(cI - A)] = c[\mathrm{adj}(cI - A)]$. From this
we conclude that each of the columns of $\mathrm{adj}(cI - A)$ must belong to the eigenspace
of $A$ associated with $c$.


Example Let $A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 4 & -17 & 8 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$. Then one can calculate that
$\mathrm{spec}(A) = \{2 - \sqrt{3}, 2 + \sqrt{3}, 4\}$. Moreover, $\mathrm{adj}(4I - A) = \begin{bmatrix} 1 & -4 & 1 \\ 4 & -16 & 4 \\ 16 & -64 & 16 \end{bmatrix}$ and
it is easy to check that the columns of this matrix are indeed eigenvectors of $A$
associated with 4.
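The adjoint in this example can be recomputed from cofactors. The following is a minimal sketch, assuming the usual cofactor definition of the adjoint; the function names `minor`, `det` and `adjugate` are hypothetical helpers introduced only for this check.

```python
def minor(m, i, j):
    """Matrix obtained from m by deleting row i and column j."""
    return [[m[r][c] for c in range(len(m)) if c != j]
            for r in range(len(m)) if r != i]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def adjugate(m):
    """Entry (i, j) of adj(m) is the (j, i) cofactor of m."""
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) for j in range(n)]
            for i in range(n)]

A = [[0, 1, 0],
     [0, 0, 1],
     [4, -17, 8]]
c = 4

# Form cI - A and its adjoint.
shifted = [[(c if i == j else 0) - A[i][j] for j in range(3)] for i in range(3)]
adj = adjugate(shifted)

# Each column of adj(cI - A) should satisfy A col = c col.
columns = [list(col) for col in zip(*adj)]
column_checks = [
    [sum(A[i][k] * col[k] for k in range(3)) for i in range(3)] == [c * x for x in col]
    for col in columns
]
print(adj, column_checks)
```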


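Proposition 12.3 If $V$ is a vector space finitely generated over a field $F$ and
if $\alpha, \beta \in \mathrm{End}(V)$ then $\mathrm{spec}(\alpha\beta) = \mathrm{spec}(\beta\alpha)$.


Proof Let $c \in \mathrm{spec}(\alpha\beta)$. If $c = 0$, this means that $\alpha\beta \notin \mathrm{Aut}(V)$. Therefore, either $\alpha$
or $\beta$ is not an automorphism of $V$, and so $\beta\alpha \notin \mathrm{Aut}(V)$ as well, that is,
$0 \in \mathrm{spec}(\beta\alpha)$. Therefore, we can
assume $c \neq 0$. Let $v$ be an eigenvector of $\alpha\beta$ associated with $c$ and let $w = \beta(v)$.
Then $\alpha(w) = \alpha\beta(v) = cv \neq 0_V$ and so $w \neq 0_V$. Moreover, $\beta\alpha(w) = \beta\alpha\beta(v) =
\beta(cv) = c\beta(v) = cw$ and so $w$ is an eigenvector of $\beta\alpha$ associated with $c$. Thus
$\mathrm{spec}(\alpha\beta) \subseteq \mathrm{spec}(\beta\alpha)$. A similar argument shows the reverse inclusion, and so we
have equality.


In particular, as a consequence of Proposition 12.3, we see that if $F$ is a field, if
$n$ is a positive integer, and if $A, B \in M_{n \times n}(F)$ then $\mathrm{spec}(AB) = \mathrm{spec}(BA)$.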

As we noted at the beginning of the chapter, if we are given a vector space $V$
finitely generated over a field $F$ and an endomorphism $\alpha$ of $V$, we would like to
find, to the extent possible, a basis $B$ of $V$ such that the matrix $\Phi_{BB}(\alpha)$ is nice, in
the sense that it is amenable to quick and accurate calculations. Let $V$ be a vector
space over a field $F$ (not necessarily finitely generated) and let $\alpha \in \mathrm{End}(V)$. Then $\alpha$
is diagonalizable if and only if there exists a basis $B$ of $V$ composed of eigenvectors
of $\alpha$.


Example We have already seen that the set $B$ of all functions in $\mathbb{R}^{\mathbb{R}}$ of the form
$x \mapsto e^{ax}$, for some $a \in \mathbb{R}$, is linearly independent. Therefore, $W = \mathbb{R}B$ is a subspace
of $\mathbb{R}^{\mathbb{R}}$ which is not finitely generated, and $B$ is a basis for $W$. Let $\alpha$ be the
endomorphism of $W$ which assigns to each $f \in W$ its derivative. Since each element
of $B$ is an eigenvector of $\alpha$, we see that $\alpha$ is diagonalizable.


The following result characterizes the diagonalizable endomorphisms of finitely- 
generated vector spaces. 


Proposition 12.4 Let $V$ be a vector space finitely generated over a field
$F$ and let $\alpha \in \mathrm{End}(V)$. Then the following conditions on a basis $B =
\{v_1, \ldots, v_n\}$ are equivalent:

(1) $v_i$ is an eigenvector of $\alpha$ for each $1 \le i \le n$;

(2) $\Phi_{BB}(\alpha)$ is a diagonal matrix.


Proof (1) $\Rightarrow$ (2): By (1), we know that for each $1 \le i \le n$ there exists a scalar $c_i$
satisfying $\alpha(v_i) = c_i v_i$ and so, by definition, $\Phi_{BB}(\alpha)$ is the diagonal matrix $[a_{ij}]$
given by
\[ a_{ij} = \begin{cases} c_i & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases} \]

(2) $\Rightarrow$ (1): If $\Phi_{BB}(\alpha) = [a_{ij}]$ is a diagonal matrix then for each $1 \le i \le n$ we
have $\alpha(v_i) = a_{ii} v_i$ and so $v_i$ is an eigenvector of $\alpha$ for each $1 \le i \le n$.


Let $V$ be a vector space over a field $F$ and let $\alpha \in \mathrm{End}(V)$. If $B$ is a basis of
$V$ made up of eigenvectors of $\alpha$ then, as we have seen above, the elements of $B$
are also eigenvectors of $p(\alpha)$ for any polynomial $p(X) \in F[X]$. We need not stick
to polynomials: suppose that each $v \in B$ is an eigenvector of $\alpha$ associated with an
eigenvalue $c_v$ of $\alpha$. Given any function whatsoever $f \colon \mathrm{spec}(\alpha) \to F$, we can define
the endomorphism $f(\alpha)$ of $V$ by setting $f(\alpha) \colon \sum_{v \in B} a_v v \mapsto \sum_{v \in B} a_v f(c_v) v$
and the elements of $B$ are also eigenvectors of $f(\alpha)$. We note that if $f$ and $g$ are
functions from $\mathrm{spec}(\alpha)$ to $F$ then $f(\alpha)g(\alpha) = g(\alpha)f(\alpha)$.

Now assume that $V$ is finitely generated over $F$ and that $B = \{v_1, \ldots, v_n\}$ is a
basis of $V$ made up of eigenvectors of $\alpha \in \mathrm{End}(V)$. For each $1 \le i \le n$, let $c_i$ be
the eigenvalue of $\alpha$ associated with $v_i$. We have already seen that for each such
$i$ there exists a polynomial $p_i(X)$, namely the Lagrange interpolation polynomial,
satisfying the condition that
\[ p_i(c_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases} \]
Thus, given a function $f \colon \mathrm{spec}(\alpha) \to F$, the polynomial $p(X) = \sum_{i=1}^n f(c_i) p_i(X)$
satisfies $p(c_i) = f(c_i)$ for all $1 \le i \le n$, and so $p(\alpha) = f(\alpha)$. Thus, for finitely-generated
vector spaces, the above generalization does not in fact contribute anything
new; it is important, however, in the case of vector spaces which are not finitely
generated.

We now show that the size of the spectrum of an endomorphism of a finitely- 

generated vector space is limited. 


Proposition 12.5 Let $V$ be a vector space over a field $F$ and let $\alpha \in \mathrm{End}(V)$.
If $c_1, \ldots, c_k$ are distinct eigenvalues of $\alpha$ and if $v_i$ is an eigenvector of $\alpha$
associated with $c_i$ for each $1 \le i \le k$, then the set $\{v_1, \ldots, v_k\}$ is linearly
independent.


Proof Assume that the set $\{v_1, \ldots, v_k\}$ is linearly dependent. Since $v_1 \neq 0_V$, we
know that the set $\{v_1\}$ is linearly independent. Thus there exists an integer $1 \le t < k$
such that the set $\{v_1, \ldots, v_t\}$ is linearly independent but $\{v_1, \ldots, v_{t+1}\}$ is linearly
dependent. In other words, there exist scalars $a_1, \ldots, a_{t+1}$, not all of which are equal
to 0, such that $\sum_{i=1}^{t+1} a_i v_i = 0_V$ and so $0_V = c_{t+1}\left( \sum_{i=1}^{t+1} a_i v_i \right) = \sum_{i=1}^{t+1} a_i c_{t+1} v_i$.
On the other hand, $0_V = \alpha\left( \sum_{i=1}^{t+1} a_i v_i \right) = \sum_{i=1}^{t+1} a_i \alpha(v_i) = \sum_{i=1}^{t+1} a_i c_i v_i$. Therefore,
$0_V = \sum_{i=1}^{t+1} a_i c_i v_i - \sum_{i=1}^{t+1} a_i c_{t+1} v_i = \sum_{i=1}^{t} a_i (c_i - c_{t+1}) v_i$. But the set $\{v_1, \ldots, v_t\}$
is linearly independent and so $a_i(c_i - c_{t+1}) = 0$ for all $1 \le i \le t$. Since, by assumption,
$c_i \neq c_{t+1}$ for all $1 \le i \le t$, we have $a_i = 0$ for all $1 \le i \le t$ and hence $a_{t+1} = 0$
as well, which is a contradiction. Thus $\{v_1, \ldots, v_k\}$ must be linearly independent.


Thus we see that if $F$ is a field and if $A \in M_{n \times n}(F)$, then $\mathrm{spec}(A)$ can have at
most $n$ elements. In particular, if $F$ has more than $n$ elements, then there exists an
element $c \in F \setminus \mathrm{spec}(A)$, and so $(cI - A)v \neq \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ for all $v \neq \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. This implies
that $cI - A$ is nonsingular.
From Proposition 12.5, we see that if $\alpha$ is an endomorphism of a vector space
$V$ over a field $F$ having distinct eigenvalues $c_1, \ldots, c_k$, and if $W_i$ is the eigenspace
associated with $c_i$ for all $1 \le i \le k$, then the collection $\{W_1, \ldots, W_k\}$ of subspaces
of $V$ is independent. Moreover, if $V$ is finitely generated over $F$ then the number of
elements in $\mathrm{spec}(\alpha)$ is no greater than $\dim(V)$.


Proposition 12.6 Let V be a vector space of finite dimension n over a field F . 
Then any endomorphism a of V having n distinct eigenvalues is diagonaliz- 
able. 


Proof This is a direct consequence of Proposition 12.4 and Proposition 12.5. 


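Example Let $\alpha \in \mathrm{End}(\mathbb{R}^2)$ be defined by $\alpha \colon \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} 3a - b \\ 3b - a \end{bmatrix}$. Then $\alpha\left( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right) =
\begin{bmatrix} 2 \\ 2 \end{bmatrix}$ and so $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector of $\alpha$ associated with the eigenvalue 2. Also,
$\alpha\left( \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right) = \begin{bmatrix} 4 \\ -4 \end{bmatrix}$ and so $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ is an eigenvector of $\alpha$ associated with the eigenvalue 4.
Thus $B = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\}$ is a basis for $\mathbb{R}^2$ and $\Phi_{BB}(\alpha) = \begin{bmatrix} 2 & 0 \\ 0 & 4 \end{bmatrix}$.


Example Let $\alpha \in \mathrm{End}(\mathbb{R}^2)$ be defined by $\alpha \colon \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a + b \\ b \end{bmatrix}$. If
$\alpha\left( \begin{bmatrix} a \\ b \end{bmatrix} \right) = c\begin{bmatrix} a \\ b \end{bmatrix}$ then $cb = b$ and $a + b = ca$, and this can happen only when
$b = 0$ and $c = 1$. Thus $\mathrm{spec}(\alpha) = \{1\}$ and the eigenspace associated with this sole
eigenvalue is $\mathbb{R}\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Since this is not all of $\mathbb{R}^2$, we know that there is no basis of
$\mathbb{R}^2$ made up of eigenvectors of $\alpha$, and hence $\alpha$ is not diagonalizable.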


Note that the converse of Proposition 12.6 is false, as we easily see by taking
$\alpha = \sigma_1$.

From the above, we know that if $A \in M_{n \times n}(\mathbb{R})$ then the matrix has at most $n$
distinct eigenvalues. However, it may have many fewer than that. If we assume that
the entries of this matrix were chosen independently and randomly from a standard
normal distribution, how many distinct eigenvalues should we expect? American
mathematicians Alan Edelman, Eric Kostlan, and Michael Shub have shown that if
$\varepsilon_n$ denotes the mathematical expectancy for the number of eigenvalues of such a
matrix in $\mathbb{R}$, then $\lim_{n \to \infty} \frac{1}{\sqrt{n}} \varepsilon_n = \sqrt{\frac{2}{\pi}}$. The situation over the complex numbers
is quite different. Given a matrix $A \in M_{n \times n}(\mathbb{C})$ one can, with probability 1, pick
a matrix $B \in M_{n \times n}(\mathbb{C})$ as near to $A$ as we wish, which has $n$ distinct eigenvalues
in $\mathbb{C}$.

If $F$ is a field, if $n$ is a positive integer, and if $A \in M_{n \times n}(F)$, then we can
consider the matrix of polynomials $XI - A \in M_{n \times n}(F[X])$. The determinant of
this matrix, $|XI - A|$, is a polynomial in $F[X]$ called the characteristic polynomial
of $A$. Note that this polynomial is always monic and of degree $n$.


Example The characteristic polynomial of $\begin{bmatrix} 1 & -1 & 0 \\ 2 & 1 & 5 \\ 4 & 2 & 1 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$ is
$X^3 - 3X^2 - 5X + 27$.
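The coefficients in this example can be reproduced mechanically. The sketch below uses the Faddeev-LeVerrier recursion, which is a technique chosen here for illustration and is not the method of the text; it divides by $1, \ldots, n$ and so is only suitable for fields such as $\mathbb{Q}$ in which these are invertible. The function names are hypothetical.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(a):
    """Coefficients [c_0, ..., c_n] of |XI - a|, lowest degree first,
    computed by the Faddeev-LeVerrier recursion over the rationals."""
    n = len(a)
    coeffs = [Fraction(0)] * n + [Fraction(1)]      # leading coefficient is 1
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        m = mat_mul(a, m)
        for i in range(n):
            m[i][i] += coeffs[n - k + 1]            # M_k = a M_{k-1} + c_{n-k+1} I
        am = mat_mul(a, m)
        coeffs[n - k] = -sum(am[i][i] for i in range(n)) / k
    return coeffs

A = [[1, -1, 0],
     [2, 1, 5],
     [4, 2, 1]]

coeffs = char_poly(A)
print(coeffs)   # constant term first
```

The constant term equals $(-1)^n |A|$, which gives a quick consistency check against a direct determinant computation.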
Example The characteristic polynomial of $A = \begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 1 & 2 & 3 \\ 3 & 2 & 1 & 1 \\ 1 & 1 & 2 & 0 \end{bmatrix} \in M_{4 \times 4}(\mathbb{R})$ is
$X^4 - 3X^3 - 11X^2 - 25X - 15$. If we sketch the graph of the polynomial function
$t \mapsto t^4 - 3t^3 - 11t^2 - 25t - 15$, we see that it has real roots in the neighborhoods
of $-0.8$ and $5.8$. (More precisely, they are approximately equal to $-0.8062070604$
and $5.7448832706$.) These are the only real eigenvalues of the matrix $A$.


Example Let $F = GF(3)$. The characteristic polynomial of $A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 1 & 0 \end{bmatrix} \in
M_{4 \times 4}(F)$ equals $X^4 + X^3 + 1 = (X + 2)(X^3 + 2X^2 + 2X + 2)$ and so $A$ has only
one eigenvalue, namely 1.
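Over a small finite field the spectrum can be found by brute force, testing each scalar $c$ for $|cI - A| = 0$. The following sketch does this for the matrix above; the helper name `det_mod` is hypothetical, and elements of $GF(3)$ are represented by the integers 0, 1, 2 (an assumption about representation only).

```python
def det_mod(m, p):
    """Determinant of an integer matrix reduced mod p
    (Laplace expansion along the first row)."""
    if len(m) == 1:
        return m[0][0] % p
    total = 0
    for j in range(len(m)):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det_mod(sub, p)
    return total % p

p = 3
A = [[1, 1, 1, 1],
     [2, 0, 1, 0],
     [0, 1, 1, 0],
     [1, 1, 1, 0]]
n = len(A)

# c is an eigenvalue exactly when cI - A is singular over GF(p).
spectrum = [
    c for c in range(p)
    if det_mod([[(c if i == j else 0) - A[i][j] for j in range(n)]
                for i in range(n)], p) == 0
]
print(spectrum)
```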


Example The characteristic polynomial of $A = \begin{bmatrix} 5 & 4 & 2 \\ 4 & 5 & 2 \\ 2 & 2 & 2 \end{bmatrix}$ in $M_{3 \times 3}(\mathbb{Q})$ is
$(X - 10)(X - 1)^2$ and so $\mathrm{spec}(A) = \{1, 10\}$. The eigenspace of $A$ associated with 10
is $\mathbb{Q}\begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}$, while the eigenspace of $A$ associated with 1 is $\mathbb{Q}\left\{ \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix} \right\}$.


Example Let $F = GF(2)$ and let $A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \in M_{2 \times 2}(F)$. The characteristic
polynomial of $A$ is $p(X) = X^2 + X + 1$ and, since $p(0) = p(1) = 1$, we see that
$\mathrm{spec}(A) = \emptyset$. In fact, it is possible to show that for every prime integer $p$ there is a
symmetric $2 \times 2$ matrix $A$ over $GF(p)$ satisfying $\mathrm{spec}(A) = \emptyset$. Later, we will show
that any symmetric matrix over $\mathbb{R}$ must have an eigenvalue.


Example Let $\alpha$ be the endomorphism of $\mathbb{C}^2$ represented with respect to the canonical
basis by a matrix $A \in M_{2 \times 2}(\mathbb{C})$ whose characteristic polynomial is $(X - 1)^2$, so that
$\mathrm{spec}(A) = \{1\}$, and whose eigenspace associated with 1 has dimension 1. Then there is
no basis of $\mathbb{C}^2$ made up of eigenvectors of $\alpha$. Therefore, $\alpha$ is not diagonalizable.


Proposition 12.7 Let $F$ be a field and let $n$ be a positive integer. If
$A \in M_{n \times n}(F)$ has characteristic polynomial $p(X) = \sum_{i=0}^n a_i X^i$, then
$|A| = (-1)^n a_0$.


Proof We note that $a_0 = p(0) = |0I - A| = |-A| = (-1)^n |A|$ and so $|A| =
(-1)^n a_0$.


The speed with which we can compute the characteristic polynomial of a matrix
depends on the speed with which we can multiply two matrices. In 1985, Swiss
computer scientist Walter Keller-Gehrig showed that if we can multiply two $n \times n$
matrices over a field $F$ in an order of $n^c$ operations, then we can calculate the
characteristic polynomial of an $n \times n$ matrix over $F$ in an order of $n^c \log(n)$ operations.
In 2007, French mathematician Clément Pernet and German/Canadian computer
scientist Arne Storjohann constructed a new algorithm with an expected cost on the
order of $n^c$, provided that the field $F$ has at least $2n^2$ elements. If one has the use of
a computer with n> parallel processors, then much faster computation times can be 
obtained. 

Any monic polynomial in $F[X]$ of positive degree is the characteristic polynomial
of some square matrix over $F$. To see this, consider a polynomial $p(X) =
\sum_{i=0}^n a_i X^i$, for $n > 0$. If $p(X)$ is monic, define the companion matrix of $p(X)$,
denoted by $\mathrm{comp}(p) \in M_{n \times n}(F)$, to be the matrix $[a_{ij}]$ given by
\[ a_{ij} = \begin{cases} 1 & \text{if } i = j + 1 \text{ and } j < n, \\ -a_{i-1} & \text{if } j = n, \\ 0 & \text{otherwise.} \end{cases} \]
Otherwise, define $\mathrm{comp}(p)$ to be $\mathrm{comp}(a_n^{-1} p)$.

Companion matrices were first studied at the
beginning of the twentieth century by German
mathematician Alfred Loewy. The term was first
introduced by the twentieth-century American
mathematician Cyrus Macduffee.


Proposition 12.8 Let $F$ be a field and let $n$ be a positive integer. If $p(X) =
\sum_{i=0}^n a_i X^i \in F[X]$ is monic, then $p(X)$ is the characteristic polynomial of
$\mathrm{comp}(p) \in M_{n \times n}(F)$.


Proof We will proceed by induction on $n$. For $n = 1$, the result is immediate. If
$n = 2$ and if $p(X) = X^2 + a_1 X + a_0$, then $\mathrm{comp}(p) = \begin{bmatrix} 0 & -a_0 \\ 1 & -a_1 \end{bmatrix}$ and so the characteristic
polynomial of $\mathrm{comp}(p)$ is $\begin{vmatrix} X & a_0 \\ -1 & X + a_1 \end{vmatrix} = p(X)$ and we are done. Assume
now that $n > 2$ and the result has been established for $n - 1$. Then the characteristic
polynomial of $\mathrm{comp}(p)$ is
\[ \begin{vmatrix} X & 0 & \cdots & a_0 \\ -1 & X & \cdots & a_1 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & -1 & X + a_{n-1} \end{vmatrix}. \]
By Proposition 11.11, this equals $X|XI - \mathrm{comp}(q)| + a_0(-1)^{n+1}|B|$, where
$q(X) = \sum_{i=0}^{n-1} a_{i+1} X^i$ and where
$B \in M_{(n-1) \times (n-1)}(F)$ is an upper-triangular matrix with diagonal entries all equal
to $-1$. Thus $|B| = (-1)^{n-1}$ and, by the induction hypothesis, $|XI - \mathrm{comp}(q)| = q(X)$.
Thus the characteristic polynomial of $\mathrm{comp}(p)$ is $Xq(X) + a_0 = p(X)$, as desired.
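Proposition 12.8 can be spot-checked numerically. The sketch below builds $\mathrm{comp}(p)$ from the definition and compares $|cI - \mathrm{comp}(p)|$ with $p(c)$ at $n + 1$ points, which suffices because both sides are polynomials of degree $n$ in $c$. The polynomial $X^4 + 7X^3 - 2X + 5$ is an arbitrary example chosen here, and the function names are hypothetical.

```python
from fractions import Fraction

def companion(coeffs):
    """comp(p) for a monic p = sum coeffs[i] X^i (coeffs[-1] must equal 1)."""
    n = len(coeffs) - 1
    m = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n - 1):
        m[j + 1][j] = Fraction(1)               # ones just below the diagonal
    for i in range(n):
        m[i][n - 1] = -Fraction(coeffs[i])      # last column holds -a_0, ..., -a_{n-1}
    return m

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

def evaluate(coeffs, x):
    return sum(a * x ** i for i, a in enumerate(coeffs))

p = [5, -2, 0, 7, 1]            # X^4 + 7X^3 - 2X + 5 (arbitrary monic example)
C = companion(p)
n = len(p) - 1

# Compare |cI - comp(p)| with p(c) at the points c = 0, 1, ..., n.
agree = all(
    det([[(c if i == j else 0) - C[i][j] for j in range(n)] for i in range(n)])
    == evaluate(p, c)
    for c in range(n + 1)
)
print(C, agree)
```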


Let $F$ be a field and let $n$ be a positive integer. Every nonsingular matrix
$P \in M_{n \times n}(F)$ defines a function $\omega_P$ from $M_{n \times n}(F)$ to itself given by $\omega_P \colon A \mapsto
P^{-1}AP$. In fact, $\omega_P \in \mathrm{Aut}(M_{n \times n}(F))$, where $\omega_P^{-1} = \omega_{P^{-1}}$. This is an automorphism
of $F$-algebras and, indeed, it can be shown that every automorphism of unital
$F$-algebras in $\mathrm{Aut}(M_{n \times n}(F))$ is of this form. Therefore, the set of all automorphisms
of the form $\omega_P$ is a group of automorphisms of $M_{n \times n}(F)$ and so defines an
equivalence relation $\sim$ by setting $A \sim B$ if and only if $B = P^{-1}AP$ for some
nonsingular matrix $P$. In this case,
we say that the matrices $A$ and $B$ are similar. From what we have already seen, two
matrices in $M_{n \times n}(F)$ are similar if and only if they represent the same endomorphism
of an $n$-dimensional vector space over $F$ with respect to different bases. One
of the problems before us is to decide, given two square matrices of the same size,
if they are similar or not.

Note that if a matrix $A \in M_{n \times n}(F)$ is similar to $O$, then it must equal $O$. Indeed,
if $P^{-1}AP = O$ then $A = (PP^{-1})A(PP^{-1}) = P(P^{-1}AP)P^{-1} = POP^{-1} = O$.

Example In $M_{3 \times 3}(\mathbb{Q})$, the matrices
\[ A = \begin{bmatrix} 20 & 10 & 10 \\ 10 & 0 & 10 \\ 10 & 10 & 10 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 80 & 130 & 100 \\ 10 & 10 & 10 \\ -50 & -80 & -60 \end{bmatrix} \]
are similar, since $B = P^{-1}AP$, where $P = \begin{bmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \\ 2 & 3 & 3 \end{bmatrix}$. Thus we note that a symmetric
matrix may be similar to a matrix which is not symmetric.
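The similarity claimed in this example can be verified without inverting $P$, since for nonsingular $P$ the equation $B = P^{-1}AP$ is equivalent to $PB = AP$. The following is a small check along those lines; the helper name `mat_mul` is hypothetical.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[20, 10, 10], [10, 0, 10], [10, 10, 10]]
B = [[80, 130, 100], [10, 10, 10], [-50, -80, -60]]
P = [[1, 2, 1], [1, 0, 1], [2, 3, 3]]

# Determinant of the 3 x 3 matrix P by the explicit cofactor formula.
det_P = (P[0][0] * (P[1][1] * P[2][2] - P[1][2] * P[2][1])
         - P[0][1] * (P[1][0] * P[2][2] - P[1][2] * P[2][0])
         + P[0][2] * (P[1][0] * P[2][1] - P[1][1] * P[2][0]))

# With P nonsingular, B = P^{-1} A P holds exactly when P B = A P.
similar = det_P != 0 and mat_mul(P, B) == mat_mul(A, P)

def is_symmetric(m):
    return all(m[i][j] == m[j][i] for i in range(len(m)) for j in range(len(m)))

print(det_P, similar, is_symmetric(A), is_symmetric(B))
```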
Example The matrices $A = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 1 \\ -1 & 0 & 2 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$ in $M_{3 \times 3}(\mathbb{Q})$ are
not similar since, were they similar, the matrices $A - I$ and $B - I$ would also be
similar, and thus have the same rank. But it is easy to see that the rank of $A - I$
equals 1, while the rank of $B - I$ equals 2.


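Example If matrices $A, B \in M_{n \times n}(F)$ are similar, it does not follow that they commute.
For example, let $A = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 3 & 0 \\ -1 & 0 & -2 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$. Then $P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$
is nonsingular and so $B = PAP^{-1} = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 3 & 0 \\ 1 & 5 & -2 \end{bmatrix}$ is similar to $A$. However,
$AB \neq BA$.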


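Example Let $F$ be a field and, for each $1 \le h \le t$, let $A_h$ be a square matrix over $F$,
which is similar to a square matrix $B_h$ over $F$. That is to say, there exists a nonsingular
square matrix $P_h$ such that $B_h = P_h^{-1} A_h P_h$. Let $A$ be the matrix in block form
\[ A = \begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_t \end{bmatrix}, \]
in which all blocks not on the diagonal are equal to $O$, and let
\[ B = \begin{bmatrix} B_1 & O & \cdots & O \\ O & B_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & B_t \end{bmatrix}. \]
Then $A$ is similar to $B$, since $B = P^{-1}AP$, where
\[ P = \begin{bmatrix} P_1 & O & \cdots & O \\ O & P_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & P_t \end{bmatrix}. \]
We will make use of this fact in the next chapter.


Proposition 12.9 Let $F$ be a field and let $k < n$ be positive integers.
Let $A \in M_{n \times n}(F)$ be a matrix which can be written in block form as
$A = \begin{bmatrix} A_{11} & A_{12} \\ O & A_{22} \end{bmatrix}$, where $A_{11} \in M_{k \times k}(F)$ and $A_{22} \in M_{(n-k) \times (n-k)}(F)$. Then
$\mathrm{spec}(A) = \mathrm{spec}(A_{11}) \cup \mathrm{spec}(A_{22})$.


Proof Let $c \in \mathrm{spec}(A)$ and let $v \in F^n$ be an eigenvector associated with $c$. Write
$v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$, where $v_1 \in F^k$ and $v_2 \in F^{n-k}$. Then
\[ \begin{bmatrix} A_{11} v_1 + A_{12} v_2 \\ A_{22} v_2 \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ O & A_{22} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = Av = cv = \begin{bmatrix} c v_1 \\ c v_2 \end{bmatrix}. \]
From this we see immediately that if $v_2 \neq \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ then $c \in \mathrm{spec}(A_{22})$, while if
$v_2 = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ then $c \in \mathrm{spec}(A_{11})$. Therefore, $\mathrm{spec}(A)$ is contained in $\mathrm{spec}(A_{11}) \cup
\mathrm{spec}(A_{22})$.

Conversely, let $c \in \mathrm{spec}(A_{11})$ and let $v_1 \in F^k$ be an eigenvector associated
with $c$. Then $A\begin{bmatrix} v_1 \\ 0 \end{bmatrix} = \begin{bmatrix} A_{11} v_1 \\ 0 \end{bmatrix} = c\begin{bmatrix} v_1 \\ 0 \end{bmatrix}$, proving that $c \in \mathrm{spec}(A)$. Now assume
that $d \in \mathrm{spec}(A_{22}) \setminus \mathrm{spec}(A_{11})$ and let $v_2 \in F^{n-k}$ be an eigenvector associated
with $d$. Since $d \notin \mathrm{spec}(A_{11})$, we know that the matrix $B = A_{11} - dI \in M_{k \times k}(F)$ is
nonsingular. Set $v_1 = B^{-1}A_{12}(-v_2)$. Then
\[ (A - dI)\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} B v_1 + A_{12} v_2 \\ (A_{22} - dI) v_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \]
showing that $d \in \mathrm{spec}(A)$. Therefore, $\mathrm{spec}(A_{11}) \cup \mathrm{spec}(A_{22}) \subseteq \mathrm{spec}(A)$,
proving equality.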
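Example Let $A = \begin{bmatrix} 1 & 1 & 5 & 6 \\ -1 & 1 & 7 & 3 \\ 0 & 0 & 3 & 1 \\ 0 & 0 & -4 & 3 \end{bmatrix} \in M_{4 \times 4}(\mathbb{C})$. Then
\[ \mathrm{spec}(A) = \mathrm{spec}\left( \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \right) \cup \mathrm{spec}\left( \begin{bmatrix} 3 & 1 \\ -4 & 3 \end{bmatrix} \right) = \{1 \pm i, 3 \pm 2i\}. \]

Proposition 12.10 Similar matrices in $M_{n \times n}(F)$, where $F$ is a field and
where $n$ is a positive integer, have identical characteristic polynomials.


Proof If $A, B \in M_{n \times n}(F)$ satisfy $B = P^{-1}AP$ then
\[ |XI - B| = |XI - P^{-1}AP| = |P^{-1}(XI - A)P| = |P|^{-1}|XI - A||P| = |XI - A|, \]
as required.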


Example The converse of Proposition 12.10 is false. Indeed, the matrices $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
and $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ are not similar, despite the fact that both of them have the same
characteristic polynomial, $(X - 1)^2$.


A generalization of Proposition 12.10 tells us that if $P, Q \in M_{n \times n}(F)$ are nonsingular
matrices satisfying $|PQ| = 1$, then the endomorphism $\alpha_{PQ}$ of $M_{n \times n}(F)$
given by $\alpha_{PQ} \colon A \mapsto PAQ$ satisfies the condition that $A$ and $\alpha_{PQ}(A)$ always have
identical characteristic polynomials. The same goes for the linear transformation
$\beta_{PQ} \colon A \mapsto PA^TQ$. Frobenius proved that any endomorphism of $M_{n \times n}(\mathbb{C})$ which
preserves characteristic polynomials must be of one of these two forms. Note that
endomorphisms of the form $\alpha_{PQ}$ or $\beta_{PQ}$ are in fact automorphisms of $M_{n \times n}(F)$.
They also satisfy the property that $\alpha_{PQ}(A)$ is singular if and only if $A$ is singular,
and similarly $\beta_{PQ}(A)$ is singular if and only if $A$ is singular. Indeed, Dieudonné has
shown that, for any field $F$, an endomorphism of $M_{n \times n}(F)$ satisfying this condition
must be of one of these two forms.


With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach.

The twentieth-century French mathematician Jean Dieudonné was
one of the founders of the influential group who wrote under the collective
name of Nicolas Bourbaki.


Example If $A$ and $B$ are square matrices over a field $F$, then we know that the
matrices $AB$ and $BA$ are not necessarily equal. They are also not necessarily similar.
For example, if $A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$ then $AB = O \neq BA$, and so $AB$
and $BA$ are not similar. Nonetheless, by Proposition 12.3, we see that $\mathrm{spec}(AB) =
\mathrm{spec}(BA)$.

Proposition 12.10 can be used to facilitate computation, as the following example
shows.


Example Let $n$ be a positive integer, let $F$ be a field, and let $A = [a_{ij}] \in M_{n \times n}(F)$
be a symmetric tridiagonal matrix. That is to say, the entries of $A$ satisfy the condition
that $a_{ij} = a_{ji}$ when $|i - j| = 1$ and $a_{ij} = 0$ when $|i - j| > 1$. Set $p_0(X) = 1$ and,
for each $1 \le k \le n$, let $p_k(X)$ be the determinant of the $k \times k$ submatrix
consisting of the first $k$ rows and first $k$ columns of the matrix $XI - A$, that is, the
characteristic polynomial of the corresponding submatrix of $A$.
Then $p_n(X)$ is the characteristic polynomial of $A$ and we have $p_1(X) = X - a_{11}$
and $p_k(X) = (X - a_{kk}) p_{k-1}(X) - a_{k,k-1}^2 p_{k-2}(X)$ for each $2 \le k \le n$. This recursion
relation allows us to compute the characteristic polynomial of $A$ quickly. Therefore,
if $A$ is any symmetric matrix, a good strategy is to try and find a symmetric
tridiagonal matrix similar to it and then compute its characteristic polynomial.
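The three-term recursion for a symmetric tridiagonal matrix, started from $p_0(X) = 1$ and $p_1(X) = X - a_{11}$, translates directly into code. The sketch below represents a polynomial as a list of coefficients with the constant term first; this representation and the function names are assumptions made for illustration.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_sub(p, q):
    """Difference p - q of two coefficient lists, padded to equal length."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def tridiagonal_char_poly(diag, off):
    """Characteristic polynomial of the symmetric tridiagonal matrix with
    diagonal entries `diag` and off-diagonal entries `off` (len(diag) - 1 of them)."""
    prev, cur = [1], [-diag[0], 1]                  # p_0 = 1, p_1 = X - a_11
    for k in range(1, len(diag)):
        # p_{k+1} = (X - a_{k+1,k+1}) p_k - a_{k+1,k}^2 p_{k-1}  (1-based indices)
        nxt = poly_sub(poly_mul([-diag[k], 1], cur),
                       [off[k - 1] ** 2 * c for c in prev])
        prev, cur = cur, nxt
    return cur

# Matrix [[2, 1, 0], [1, 2, 1], [0, 1, 2]]: expect X^3 - 6X^2 + 10X - 4.
coeffs = tridiagonal_char_poly([2, 2, 2], [1, 1])
print(coeffs)
```

Each step costs a number of coefficient operations proportional to $k$, so the whole polynomial is obtained in an order of $n^2$ operations, compared with the much larger cost of a general determinant expansion.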


Let $\alpha$ be an endomorphism of a vector space $V$ finitely generated over a field $F$
and let $c \in \mathrm{spec}(\alpha)$. The algebraic multiplicity of $c$ is the largest integer $k$ such that
$(X - c)^k$ divides the characteristic polynomial of $\alpha$. The geometric multiplicity of $c$
is the dimension of the eigenspace of $\alpha$ associated with $c$. The geometric multiplicity
of $c$ is not greater than its algebraic multiplicity, but these two numbers need not be
equal, as the following examples show. If these two multiplicities are equal, we say
that $c$ is a semisimple eigenvalue of $\alpha$; an eigenvalue which is not semisimple is
defective. In particular, if the algebraic multiplicity of $c$ is 1 then the same must be
true for its geometric multiplicity. In that case, we say that $c$ is a simple eigenvalue
of $\alpha$. If at least one eigenvalue of $\alpha$ has geometric multiplicity greater than 1, then
$\alpha$ is derogatory; otherwise, it is nonderogatory.


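Example If $\alpha \in \mathrm{End}(\mathbb{R}^2)$ is defined by $\alpha \colon \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a + b \\ b \end{bmatrix}$ then $c = 1$ is an eigenvalue
of $\alpha$ with associated eigenspace $\mathbb{R}\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and so the geometric multiplicity of
$c$ is 1. On the other hand, $\alpha$ is represented with respect to the canonical basis by
the matrix $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, so its characteristic polynomial is $(X - 1)^2$, implying that the
algebraic multiplicity of $c$ is 2.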


Example Let $\alpha \in \mathrm{End}(\mathbb{R}^3)$ be the endomorphism represented with respect to the
canonical basis by the matrix $\begin{bmatrix} 2 & 3 & 1 \\ 3 & 2 & 4 \\ 0 & 0 & -1 \end{bmatrix}$. The characteristic polynomial of $\alpha$
is $(X - 5)(X + 1)^2$ and so $\mathrm{spec}(\alpha) = \{-1, 5\}$, where the algebraic multiplicity of
$-1$ equals 2 and the algebraic multiplicity of 5 equals 1. The eigenspace associated
with $-1$ is $\mathbb{R}\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}$ and the eigenspace associated with 5 is $\mathbb{R}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$. Thus both
eigenvalues have geometric multiplicity 1. Hence, 5 is a simple eigenvalue of $\alpha$
whereas $-1$ is defective.


Let n be a positive integer. If a € End(R”) is represented with respect to a given 
basis of R” by a matrix all entries in which are positive, then Perron, using ana- 
lytic methods, showed that the eigenvalue of largest absolute value of a is simple 
and positive, and has an associated eigenvector all entries of which are positive. 
This result has many important applications in statistics and economics, especially 
in input-output analysis. It was also used by Thurston in his classification of sur- 
face diffeomorphisms in topology. Perron’s results were later extended by Frobe- 
nius to certain matrices all entries in which are nonnegative, and later by Karlin to 
certain endomorphisms of spaces which are not finite-dimensional. In 1948, Philip 
Stein and R.L. Rosenberg used Frobenius’ extension of Perron’s results to com- 
pare the convergence rates of the Jacobi and Gauss-Seidel iteration methods for 
solution of systems of linear equations. Their results have since been considerably 
extended. 


With kind permission of the Archives of the Mathematisches Forschungsinstitut Oberwolfach (Perron, Frobe- 
nius, Thurston). 

The twentieth-century German mathematician Oskar Perron worked in many areas of al- 
gebra and geometry. Fellow German mathematician Georg Frobenius is known for his 
important work in group theory and his work on bilinear forms. He was also the first to 
consider the rank of a matrix. William Thurston is a contemporary American geometer; 
the twentieth-century American applied mathematician Samuel Karlin published exten- 
sively in probability and statistics, as well as mathematical biology. 


Proposition 12.11 Let $V$ be a vector space finitely generated over a field $F$
and let $\alpha$ be an endomorphism of $V$ satisfying the condition that the characteristic
polynomial of $\alpha$ is completely reducible. Then $\alpha$ is diagonalizable if
and only if every eigenvalue of $\alpha$ is semisimple.


Proof Let $n = \dim(V)$ and let $\mathrm{spec}(\alpha) = \{c_1, \ldots, c_k\}$. First of all, we will assume that there exists a
basis $D$ of $V$ such that $\Phi_{DD}(\alpha)$ is a diagonal matrix. For each $1 \le j \le k$, denote
by $m(j)$ the number of times that $c_j$ appears on the diagonal of $\Phi_{DD}(\alpha)$. Then
$\sum_{j=1}^k m(j) = n$ and, by Proposition 12.4, we know that for each $1 \le j \le k$ there exists
a subset of $D$, having $m(j)$ elements, which is a basis for the eigenspace of $\alpha$ associated
with $c_j$. Moreover, the characteristic polynomial of $\alpha$ is $\prod_{j=1}^k (X - c_j)^{m(j)}$
and so $m(j)$ equals both the algebraic multiplicity and the geometric multiplicity of
$c_j$ for each $1 \le j \le k$, proving that each such $c_j$ is semisimple. Conversely, assume
that each $c_j$ is semisimple, and for each $1 \le j \le k$ let $m(j)$ be the algebraic
(and geometric) multiplicity of $c_j$. Let $D_j$ be a basis for the eigenspace of $\alpha$ associated
with $c_j$, and let $D = \bigcup_{j=1}^k D_j$. Then $D$ is a linearly-independent subset of
$V$ having $n$ elements, and so is a basis of $V$ over $F$. The result then follows from
Proposition 7.5.


Example The condition in Proposition 12.11 that the characteristic polynomial of $\alpha$
be completely reducible is essential. To see this, consider the endomorphism $\alpha$ of $\mathbb{R}^3$
represented with respect to the canonical basis by the matrix $A = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.
The characteristic polynomial of $\alpha$ is $(X - 1)(X^2 + 1) \in \mathbb{R}[X]$ and so $\mathrm{spec}(\alpha) = \{1\}$,
where 1 is a simple eigenvalue of $\alpha$ and so it is surely semisimple. The eigenspace
of $\alpha$ associated with this eigenvalue is $\mathbb{R}\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ and so its dimension is 1. Hence $\alpha$ is
not diagonalizable.


Example Consider the endomorphism $\alpha$ of $\mathbb{R}^3$ represented with respect to the
canonical basis by the matrix $\begin{bmatrix} -1 & -1 & -2 \\ 8 & -11 & -8 \\ -10 & 11 & 7 \end{bmatrix}$ and let $\beta$ be the endomorphism
of $\mathbb{R}^3$ represented with respect to the canonical basis by the matrix
$\begin{bmatrix} 1 & -4 & -4 \\ 8 & -11 & -8 \\ -8 & 8 & 5 \end{bmatrix}$. These two endomorphisms have the same characteristic polynomial
$X^3 + 5X^2 + 3X - 9 = (X - 1)(X + 3)^2$. Thus the algebraic multiplicity
of the eigenvalue 1 equals 1 and the algebraic multiplicity of the eigenvalue $-3$
equals 2. But for $\alpha$, the geometric multiplicity of $-3$ equals 1, so $\alpha$ is not diagonalizable.
On the other hand, for $\beta$ the geometric multiplicity of $-3$ equals 2, and so $\beta$
is diagonalizable.
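The geometric multiplicities in this example can be computed as $n - \mathrm{rank}(A - cI)$. The following sketch does so with Gaussian elimination over the rationals; the function names are hypothetical and the eigenvalues tested are 1 and $-3$, the roots of the stated characteristic polynomial.

```python
from fractions import Fraction

def rank(m):
    """Rank of a matrix via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in r] for r in m]
    r = 0
    for col in range(len(rows[0])):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][col] / rows[r][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def geometric_multiplicity(m, c):
    """Dimension of the eigenspace of m associated with c (0 if c is not an eigenvalue)."""
    n = len(m)
    shifted = [[m[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)

alpha = [[-1, -1, -2], [8, -11, -8], [-10, 11, 7]]
beta = [[1, -4, -4], [8, -11, -8], [-8, 8, 5]]

mults = {
    "alpha": (geometric_multiplicity(alpha, 1), geometric_multiplicity(alpha, -3)),
    "beta": (geometric_multiplicity(beta, 1), geometric_multiplicity(beta, -3)),
}
print(mults)
```

For `beta` the two multiplicities add up to 3, the dimension of the space, which is what diagonalizability requires; for `alpha` they add up to only 2.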


Let $F$ be a field and let $(K, \bullet)$ be an associative unital $F$-algebra. If $v \in K$
and if $p(X) = \sum_{i=0}^n c_i X^i \in F[X]$, then $p(v) = \sum_{i=0}^n c_i v^i \in K$. For any polynomial
$q(X) \in F[X]$ we have $p(v) \bullet q(v) = q(v) \bullet p(v)$. In particular, $v \bullet p(v) =
p(v) \bullet v$. It is clear that $\mathrm{Ann}(v) = \{p(X) \in F[X] \mid p(v) = 0_K\}$ is a subspace
of $F[X]$. If $p(v) = 0_K$, we say that $v$ annihilates the polynomial $p(X)$.

In particular, we note that all of the above is true for the associative unital
$F$-algebra $M_{n \times n}(F)$, where $n$ is a positive integer. We note that if $A \sim B$
in $M_{n \times n}(F)$ then there is a nonsingular matrix $P$ such that $B = P^{-1}AP$ and
so $p(B) = P^{-1}p(A)P$ so that if $p(A) = O$ then $p(B) = O$. Thus we see that
$\mathrm{Ann}(A) = \mathrm{Ann}(B)$ whenever the matrices $A$ and $B$ are similar.
Example Let $A = [\,\ldots\,] \in M_{2\times 2}(\mathbb{R})$ and let $p(X) = X^2 - X + 2 \in \mathbb{R}[X]$. Then $p(A) = A^2 - A + 2I = [\,\ldots\,] \neq O$. On the other hand, if $q(X) = X^2 - 2X - 1$ then $q(A) = O$ so $q(X) \in \operatorname{Ann}(A)$.


Proposition 12.12 Let $F$ be a field and let $(K, \bullet)$ be an associative unital $F$-algebra finitely generated over $F$. Then $\operatorname{Ann}(v)$ is nontrivial for each $v \in K$.


Proof Let $\dim(K) = n$. If $v \in K$ then $\{v^0, v^1, \ldots, v^n\}$ cannot be a linearly independent set and so there exist scalars $a_0, \ldots, a_n$, not all equal to 0, such that $\sum_{i=0}^n a_i v^i = 0_K$. In other words, there exists a nonzero polynomial $p(X) = \sum_{i=0}^n a_i X^i$ in $\operatorname{Ann}(v)$.


We now show why one cannot define “three-dimensional complex numbers”. 


Proposition 12.13 If $n$ is an odd integer greater than 1 then there is no way of defining on $\mathbb{R}^n$ the structure of an $\mathbb{R}$-algebra which is also a field.


Proof Assume that we can define an operation on $V = \mathbb{R}^n$ (which we will denote by concatenation) which turns it into an $\mathbb{R}$-algebra which is also a field, and let $v_1$ be the identity element for this operation. Then $V \neq \mathbb{R}v_1$ since $\dim(V) > 1$. Pick an element $y \in V \setminus \mathbb{R}v_1$ and let $\alpha \in \operatorname{End}(V)$ be given by $\alpha: v \mapsto yv$, which is represented with respect to the canonical basis of $\mathbb{R}^n$ by a matrix $A$. The characteristic polynomial $p(X)$ of $A$ belongs to $\mathbb{R}[X]$ and has odd degree; therefore, it has a root $c$ in $\mathbb{R}$. Thus $p(X) = (X - c)^k q(X)$ for some $k \geq 1$ and some $q(X) \in \mathbb{R}[X]$ satisfying $q(c) \neq 0$. Let $\beta \in \operatorname{End}(V)$ be given by $\beta: v \mapsto (y - cv_1)^k v$. Then $\beta \neq \sigma_0$ since $y \notin \mathbb{R}v_1$, and $0_V \neq q(c) = q(cv_1)$. But then $(y - cv_1)^k q(cv_1) = 0_V$, contradicting Proposition 2.3(12).


Let $F$ be a field and let $(K, \bullet)$ be an associative unital $F$-algebra. If $v \in K$ satisfies the condition that $\operatorname{Ann}(v)$ is nontrivial then $\operatorname{Ann}(v)$ must contain a nonzero polynomial $p(X) = \sum_{i=0}^n a_i X^i$ of minimal degree. This means, in particular, that $a_n \neq 0$ and so the monic polynomial $a_n^{-1}p(X)$ also belongs to $\operatorname{Ann}(v)$. We claim that it is the unique monic polynomial of minimal degree in $\operatorname{Ann}(v)$. Indeed, if $q(X)$ is a monic polynomial of degree $n$ belonging to $\operatorname{Ann}(v)$ not equal to $a_n^{-1}p(X)$, then $r(X) = q(X) - a_n^{-1}p(X)$ is a nonzero element of $\operatorname{Ann}(v)$. But $\deg(r) < n$, contradicting the minimality of the degree $n$ of $p(X)$. Thus we see that $\operatorname{Ann}(v)$, if nontrivial, contains a unique monic polynomial of minimal positive degree, which we call the minimal polynomial of $v$ over $F$ and denote by $m_v(X)$.

Example We know that $\mathbb{C}$ is an associative unital $\mathbb{R}$-algebra. If $c = a + bi \in \mathbb{C} \setminus \mathbb{R}$, then its minimal polynomial over $\mathbb{R}$ is $(X - c)(X - \bar{c}) = X^2 - 2aX + (a^2 + b^2)$.


In particular, if $F$ is a field and if $n$ is a positive integer, then any matrix $A \in M_{n\times n}(F)$ has a minimal polynomial, which we denote by $m_A(X)$. If $A$ and $B$ are similar matrices, then $m_A(X) = m_B(X)$. Similarly, if $V$ is a vector space finitely generated over a field $F$, and if $\alpha \in \operatorname{End}(V)$ then $\alpha$ has a minimal polynomial $m_\alpha(X)$, and this equals the minimal polynomial of $\Phi_{DD}(\alpha)$ for any basis $D$ of $V$. If $f(X) \in F[X]$ is monic of positive degree then it is easy to see that $f(X) = m_{\operatorname{comp}(f)}(X)$ and so every monic polynomial of positive degree is the minimal polynomial of some matrix.


Example Let $(K, \bullet)$ be an associative unital entire $\mathbb{R}$-algebra. Assume that $v \in K$ has a minimal polynomial $m_v(X) \in \mathbb{R}[X]$. By Proposition 4.4, we know that $m_v(X) = \prod_{i=1}^t p_i(X)$, where the $p_i(X)$ are irreducible polynomials of degree at most 2. But then $\prod_{i=1}^t p_i(v) = 0_K$ and, since $K$ is entire, there is some index $h$ such that $p_h(v) = 0_K$. By minimality, this means that $m_v(X) = p_h(X)$. We thus conclude that any element of $K$ having a minimal polynomial has one of degree at most 2.


Proposition 12.14 Let $F$ be a field and let $(K, \bullet)$ be an associative $F$-algebra finitely generated over $F$. If $v \in K$ satisfies the condition that $\operatorname{Ann}(v)$ is nontrivial and if $p(X) \in \operatorname{Ann}(v)$, then there is a polynomial $q(X) \in F[X]$ satisfying $p(X) = m_v(X)q(X)$.


Proof If $p(X)$ is the 0-polynomial, pick $q(X)$ to be the 0-polynomial, and we are done. Therefore, assume that $\deg(p) \geq 0$. From Proposition 4.2, we know that we can write $p(X) = m_v(X)q(X) + r(X)$, where $q(X), r(X) \in F[X]$, with $\deg(r) < \deg(m_v)$. Since $p(v) = 0_K$, we see that $0_K = m_v(v) \bullet q(v) + r(v) = r(v)$. Since $\deg(r) < \deg(m_v)$, the minimality of $m_v(X)$ forces $\deg(r) = -\infty$, that is, $r(X)$ is the 0-polynomial, and so $p(X) = m_v(X)q(X)$.



This fundamental result was first established at the beginning of the 
twentieth century by the German mathematician Kurt Hensel. 
Proposition 12.15 Let $F$ be a field and let $(K, \bullet)$ be an associative unital $F$-algebra with multiplicative identity $e$. If $v \in K$ has a minimal polynomial $m_v(X) = \sum_{i=0}^n a_i X^i$ then:

(1) $v$ is a unit of $K$ if and only if $a_0 \neq 0$; and

(2) If $v$ is a unit of $K$ then $v^{-1} = g(v)$, where $g(X) = -a_0^{-1}\sum_{i=1}^n a_i X^{i-1} \in F[X]$.


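Proof If $a_0 \neq 0$ then $m_v(v) = 0_K$ implies that $e = a_0^{-1}\left[-\sum_{i=1}^n a_i v^i\right] = a_0^{-1}\left[-\sum_{i=1}^n a_i v^{i-1}\right] \bullet v = g(v) \bullet v = v \bullet g(v)$ and so $v$ is a unit and $v^{-1} = g(v)$. Conversely, assume that $v$ is a unit. Had we $a_0 = 0$, we would have $0_K = m_v(v) = v \bullet \left[\sum_{i=1}^n a_i v^{i-1}\right]$ and so $0_K = v^{-1}m_v(v) = \sum_{i=1}^n a_i v^{i-1}$. Thus $\sum_{i=1}^n a_i X^{i-1} \in \operatorname{Ann}(v)$, contradicting the minimality of the degree of $m_v(X)$. Hence $a_0 \neq 0$.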


It is important to note that the minimal polynomial of a matrix over a field need not equal its characteristic polynomial. For example, if we consider $I \in M_{n\times n}(F)$ for any field $F$ and any integer $n > 1$, then the characteristic polynomial of $I$ is $(X - 1)^n$ whereas its minimal polynomial is $X - 1$.


Example Let $F$ be a field. The matrix $A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \in M_{2\times 2}(F)$ annihilates the polynomial $X(X - 1)$, and this is in fact its minimal polynomial. It is also the characteristic polynomial of $A$. Thus we see that the minimal polynomial of a matrix does not have to be irreducible. Notice too that the rank of $A$ equals 1, but the degree of its minimal polynomial is 2. Thus the degree of the minimal polynomial of a matrix may be larger than its rank.


Example Let $F$ be a field. The matrix $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \in M_{3\times 3}(F)$ annihilates the polynomial $X(X - 1)$, and this is in fact its minimal polynomial. The characteristic polynomial of $A$ is $X^2(X - 1)$.


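Example One can check that $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{bmatrix}, \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{bmatrix} \in M_{3\times 3}(\mathbb{Q})$ are not similar, but they both have the same minimal polynomial, namely $(X - 1)(X - 3)$.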


Example Proposition 12.15 can be used to calculate the inverse of a nonsingular matrix, though it is rarely the most efficient method of doing so. For example, the matrix $A = \begin{bmatrix} 2 & -2 & 4 \\ 2 & 3 & 2 \\ -1 & 1 & -1 \end{bmatrix}$ has minimal polynomial $X^3 - 4X^2 + 7X - 10$, so

$$A^{-1} = \frac{1}{10}\left(A^2 - 4A + 7I\right) = \frac{1}{10}\begin{bmatrix} -5 & 2 & -16 \\ 0 & 2 & 4 \\ 5 & 0 & 10 \end{bmatrix}.$$
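
The computation in this example follows the formula $v^{-1} = g(v)$ of Proposition 12.15 and can be reproduced in a few lines. The sketch below is illustrative only: the function name `inverse_from_minimal_polynomial` and the convention of passing the coefficients as a list $[a_0, a_1, \ldots, a_n]$ are assumptions made here, and the minimal polynomial is taken as given rather than computed.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def inverse_from_minimal_polynomial(A, coeffs):
    """Return g(A), where g(X) = -a_0^{-1} * sum_{i>=1} a_i X^{i-1}.

    coeffs = [a_0, a_1, ..., a_n] are the coefficients of m_A(X);
    a_0 must be nonzero, which by Proposition 12.15 means A is a unit.
    """
    n = len(A)
    a0 = Fraction(coeffs[0])
    if a0 == 0:
        raise ValueError("a_0 = 0: the matrix is singular")
    total = [[Fraction(0)] * n for _ in range(n)]
    power = identity(n)                      # holds A^(i-1) for the current i
    for a in coeffs[1:]:
        total = [[total[r][s] + a * power[r][s] for s in range(n)]
                 for r in range(n)]
        power = mat_mul(power, A)
    return [[-x / a0 for x in row] for row in total]

A = [[2, -2, 4], [2, 3, 2], [-1, 1, -1]]
# m_A(X) = X^3 - 4X^2 + 7X - 10, listed from the constant term upwards
A_inv = inverse_from_minimal_polynomial(A, [-10, 7, -4, 1])
print(A_inv)
print(mat_mul(A, A_inv) == identity(3))
```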


Proposition 12.16 (Cayley-Hamilton Theorem) Let $F$ be a field and let $n$ be a positive integer. Then every matrix in $M_{n\times n}(F)$ annihilates its characteristic polynomial.


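Proof Let $A$ be a matrix in $M_{n\times n}(F)$ having characteristic polynomial $p(X) = X^n + \sum_{i=0}^{n-1} a_i X^i$. Let us look at the matrix $[g_{ij}(X)] = \operatorname{adj}(XI - A) \in M_{n\times n}(F[X])$, where each $g_{ij}(X)$ is a polynomial of degree at most $n - 1$. Then we can write this matrix in the form $\sum_{i=1}^n B_i X^{n-i}$, where the $B_i$ are matrices in $M_{n\times n}(F)$. Moreover, we know that

$$p(X)I = |XI - A|\,I = (XI - A)\operatorname{adj}(XI - A) = (XI - A)\left(\sum_{i=1}^n B_i X^{n-i}\right).$$

Equating coefficients of the various powers of $X$, we thus see

$$\begin{aligned}
B_1 &= I, \\
B_2 - AB_1 &= a_{n-1}I, \\
B_3 - AB_2 &= a_{n-2}I, \\
&\;\;\vdots \\
B_n - AB_{n-1} &= a_1 I, \\
-AB_n &= a_0 I.
\end{aligned}$$

For $1 \leq h \leq n + 1$, multiply both sides of the $h$th equation above on the left by $A^{n+1-h}$ and then sum both sides. The left-hand sides telescope to $O$ and the right-hand sides add up to $p(A)$, so we obtain $O = p(A)$.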
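
The recurrence $B_{h+1} = AB_h + a_{n-h}I$ appearing in this proof can also be run forwards. The sketch below uses the Faddeev-LeVerrier rule $a_{n-h} = -\operatorname{tr}(AB_h)/h$ to produce the coefficients; that rule is not taken from the text, and it requires division by $h$, so the sketch assumes a field of characteristic 0 (here $\mathbb{Q}$). The function name is illustrative.

```python
from fractions import Fraction

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def faddeev_leverrier(A):
    """Return (coeffs, Bs, residual) for a square rational matrix A.

    coeffs   = [1, a_{n-1}, ..., a_0], the characteristic polynomial of A
    Bs       = [B_1, ..., B_n], the coefficient matrices of adj(XI - A)
    residual = A*B_n + a_0*I, which equals p(A) and so should be zero
    """
    n = len(A)
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    B = I                                   # B_1 = I
    coeffs = [Fraction(1)]                  # coefficient of X^n
    Bs = []
    for h in range(1, n + 1):
        Bs.append(B)
        AB = mat_mul(A, B)
        a = -sum(AB[i][i] for i in range(n)) / h      # a_{n-h}
        coeffs.append(a)
        # B_{h+1} = A*B_h + a_{n-h}*I, the h-th equation of the proof rearranged
        B = [[AB[i][j] + a * I[i][j] for j in range(n)] for i in range(n)]
    return coeffs, Bs, B

A = [[2, -2, 4], [2, 3, 2], [-1, 1, -1]]
coeffs, Bs, residual = faddeev_leverrier(A)
print(coeffs)     # coefficients of X^3, X^2, X, 1
print(residual)   # p(A)
```

Unwinding the recurrence gives $B_{n+1} = A^n + a_{n-1}A^{n-1} + \cdots + a_0 I = p(A)$, so a zero residual is exactly the statement of the Cayley-Hamilton Theorem for this matrix.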


We see from Proposition 12.14 and Proposition 12.16 that the minimal polyno- 
mial of any n x n matrix over a field divides its characteristic polynomial and so the 
degree of the minimal polynomial is at most n. 

Let $V$ be a vector space finitely generated over a field $F$ and let $\sigma_0 \neq \alpha \in \operatorname{End}(V)$. In Proposition 12.4, we saw that $\alpha$ is diagonalizable if and only if there is a basis of $V$ that is composed of eigenvectors of $\alpha$. Moreover, in that case, if $\operatorname{spec}(\alpha) = \{c_1, \ldots, c_k\}$ and if, for each $1 \leq i \leq k$, we denote the eigenspace of $\alpha$ associated with $c_i$ by $W_i$, then for each $1 \leq i \leq k$ we have a projection $\pi_i \in \operatorname{End}(V)$ satisfying the following conditions:

(1) $\operatorname{im}(\pi_i) = W_i$;
(2) $\pi_1 + \cdots + \pi_k = \sigma_1$;
(3) $\pi_i\pi_j = \sigma_0$ whenever $i \neq j$;
(4) $\alpha = c_1\pi_1 + \cdots + c_k\pi_k$.

For each $1 \leq h \leq k$, let $p_h(X)$ be the $h$th Lagrange interpolation polynomial determined by $c_1, \ldots, c_k$. Then we can check that $\pi_h = p_h(\alpha)$ for each $h$, since $p_h(X)(X - c_h)$ is just a scalar multiple of the minimal polynomial of $\alpha$.
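
As an illustration of the identity $\pi_h = p_h(\alpha)$, the following sketch evaluates the Lagrange interpolation polynomials $p_h(X) = \prod_{j \neq h}(X - c_j)/(c_h - c_j)$ at a matrix. It assumes that the matrix is diagonalizable and that the list of distinct eigenvalues is already known; the function name `spectral_projections` is hypothetical. The test matrix is the diagonalizable matrix with eigenvalues 1 and $-3$ from the earlier example.

```python
from fractions import Fraction

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def spectral_projections(A, eigenvalues):
    """Evaluate the Lagrange polynomials of the distinct eigenvalues at A.

    Assumes A is diagonalizable with exactly these distinct eigenvalues.
    """
    n = len(A)
    I = identity(n)
    projections = []
    for h, ch in enumerate(eigenvalues):
        P = I
        for j, cj in enumerate(eigenvalues):
            if j == h:
                continue
            # one factor (A - c_j I) / (c_h - c_j) of p_h(A)
            factor = [[(Fraction(A[r][s]) - cj * I[r][s]) / (ch - cj)
                       for s in range(n)] for r in range(n)]
            P = mat_mul(P, factor)
        projections.append(P)
    return projections

B = [[1, -4, -4], [8, -11, -8], [-8, 8, 5]]
pi_1, pi_2 = spectral_projections(B, [1, -3])

n = len(B)
total = [[pi_1[r][s] + pi_2[r][s] for s in range(n)] for r in range(n)]
recombined = [[1 * pi_1[r][s] + (-3) * pi_2[r][s] for s in range(n)]
              for r in range(n)]
print(total == identity(n))                  # condition (2)
print(mat_mul(pi_1, pi_2))                   # condition (3)
print(recombined == B)                       # condition (4)
```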

Is it possible to simultaneously diagonalize two distinct endomorphisms of $V$? Indeed, let $V$ be a vector space finitely generated over a field $F$ and let $\alpha$ and $\beta$ be distinct elements of $\operatorname{End}(V) \setminus \{\sigma_0\}$. There exists a basis $D$ of $V$ such that both $\Phi_{DD}(\alpha)$ and $\Phi_{DD}(\beta)$ are diagonal matrices if and only if the elements of $D$ are eigenvectors of $\alpha$ as well as of $\beta$. Suppose that we have in hand such a basis $D = \{u_1, \ldots, u_n\}$. Since diagonal matrices commute with each other, we see that $\Phi_{DD}(\alpha\beta) = \Phi_{DD}(\alpha)\Phi_{DD}(\beta) = \Phi_{DD}(\beta)\Phi_{DD}(\alpha) = \Phi_{DD}(\beta\alpha)$ and so $\alpha\beta = \beta\alpha$. Therefore, a necessary condition for both endomorphisms of $V$ to be represented by diagonal matrices with respect to the same basis is that they form a commuting pair.

We also note that if $D$ is a basis for a vector space $V$ over a field $F$ then the set of all endomorphisms $\alpha$ of $V$ satisfying the condition that $\Phi_{DD}(\alpha)$ is a diagonal matrix is a subspace of $\operatorname{End}(V)$. Indeed, this is an immediate consequence of the fact that the set of all diagonal $n \times n$ matrices is a subspace of $M_{n\times n}(F)$.


Proposition 12.17 Let $V$ be a vector space over a field $F$ and let $\alpha, \beta$ be a commuting pair of endomorphisms of $V$. Then $p(\alpha)q(\beta) = q(\beta)p(\alpha)$ for any $p(X), q(X) \in F[X]$.


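Proof Initially, we will consider the special case of $q(X) = X$. If $p(X) = \sum_{i=0}^n a_i X^i$ then $\beta\alpha^2 = (\beta\alpha)\alpha = (\alpha\beta)\alpha = \alpha(\beta\alpha) = \alpha(\alpha\beta) = \alpha^2\beta$, and, by induction, we similarly have $\beta\alpha^k = \alpha^k\beta$ for every positive integer $k$. Therefore

$$\beta p(\alpha) = \beta\left(\sum_{i=0}^n a_i\alpha^i\right) = \sum_{i=0}^n a_i\beta\alpha^i = \sum_{i=0}^n a_i\alpha^i\beta = \left(\sum_{i=0}^n a_i\alpha^i\right)\beta = p(\alpha)\beta.$$

Now a proof similar to the first part shows that $p(\alpha)\beta^k = \beta^k p(\alpha)$ for every positive integer $k$ and hence, by a proof similar to the second part, we get $p(\alpha)q(\beta) = q(\beta)p(\alpha)$ for any $p(X), q(X) \in F[X]$.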


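As a consequence of this we note that if $\alpha, \beta \in \operatorname{End}(V)$ are commuting projections then $(\alpha\beta)^2 = (\alpha\beta)(\alpha\beta) = \alpha(\beta\alpha)\beta = \alpha(\alpha\beta)\beta = \alpha^2\beta^2 = \alpha\beta$ and so $\alpha\beta$ is a projection as well.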


Proposition 12.18 Let $V$ be a vector space finitely generated over a field $F$ and let $\alpha, \beta \in \operatorname{End}(V)$ be diagonalizable endomorphisms of $V$. Then there exists a basis of $V$ relative to which both $\alpha$ and $\beta$ can be represented by diagonal matrices if and only if $\alpha\beta = \beta\alpha$.

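Proof We have already noted that if $\alpha$ and $\beta$ can both be represented by diagonal matrices with respect to a given basis of $V$ then we must have $\alpha\beta = \beta\alpha$. Conversely, assume that $\alpha$ and $\beta$ are diagonalizable endomorphisms of $V$ satisfying $\alpha\beta = \beta\alpha$. Then, as we have already seen, there exist distinct scalars $c_1, \ldots, c_k$ and projections $\pi_1, \ldots, \pi_k \in \operatorname{End}(V)$ such that $\pi_1 + \cdots + \pi_k = \sigma_1$, $\pi_i\pi_j = \sigma_0$ for $i \neq j$, and $c_1\pi_1 + \cdots + c_k\pi_k = \alpha$. Similarly, there exist distinct scalars $d_1, \ldots, d_t$ and projections $\eta_1, \ldots, \eta_t \in \operatorname{End}(V)$ such that $\eta_1 + \cdots + \eta_t = \sigma_1$, $\eta_i\eta_j = \sigma_0$ for $i \neq j$, and $d_1\eta_1 + \cdots + d_t\eta_t = \beta$. Therefore,

$$\alpha = \alpha\sigma_1 = \left(\sum_{i=1}^k c_i\pi_i\right)\left(\sum_{j=1}^t \eta_j\right) = \sum_{i=1}^k\sum_{j=1}^t c_i\pi_i\eta_j \quad\text{and}\quad \beta = \beta\sigma_1 = \left(\sum_{j=1}^t d_j\eta_j\right)\left(\sum_{i=1}^k \pi_i\right) = \sum_{i=1}^k\sum_{j=1}^t d_j\eta_j\pi_i.$$

Since we saw that for each $1 \leq i \leq k$ we have $\pi_i = p_i(\alpha)$ for some $p_i(X) \in F[X]$ and similarly for each $1 \leq j \leq t$ we have $\eta_j = q_j(\beta)$ for some $q_j(X) \in F[X]$, we conclude from Proposition 12.17 that $\pi_i\eta_j = \eta_j\pi_i$ for each such $i$ and $j$. Call this common value $\theta_{ij}$. By the comments after Proposition 12.17, we see that $\theta_{ij}$ is also a projection in $\operatorname{End}(V)$. We note that $\theta_{ij}\theta_{hm} = \pi_i\eta_j\pi_h\eta_m = \pi_i\pi_h\eta_j\eta_m$ and this equals $\sigma_0$ when $i \neq h$ or $j \neq m$. Thus $\sum_{i=1}^k\sum_{j=1}^t \theta_{ij} = \left(\sum_{i=1}^k \pi_i\right)\left(\sum_{j=1}^t \eta_j\right) = \sigma_1$. Hence we have shown that $\alpha$ and $\beta$ are simultaneously diagonalizable, using those projections $\theta_{ij}$ which are nonzero (as some of them may be zero).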


We now turn to another classical result. 


Proposition 12.19 Let $V$ be a vector space over a field $F$ and let $K$ be a subalgebra of $\operatorname{End}(V)$ such that there is no nontrivial proper subspace of $V$ which is invariant under every $\alpha \in K$. Suppose that $\beta \in \operatorname{End}(V)$ has a nonempty spectrum and commutes with every element of $K$. Then $\beta = \sigma_c$ for some $c \in F$.


Proof Pick $c \in \operatorname{spec}(\beta)$ and let $W$ be the eigenspace of $\beta$ associated with $c$. This is a nontrivial subspace of $V$. If $\alpha \in K$ and $w \in W$ then $\beta\alpha(w) = \alpha\beta(w) = \alpha(cw) = c\alpha(w)$ and so $\alpha(w) \in W$. Thus $W$ is a nontrivial subspace of $V$ invariant under every $\alpha \in K$ and so, by assumption, it cannot be proper. Therefore, $W = V$ and so $\beta = \sigma_c$.


Recall that if the field $F$ is algebraically closed then any element of $\operatorname{End}(V)$ other than $\sigma_0$ has a nonempty spectrum.


Proposition 12.20 Let $V$ be a vector space over an algebraically-closed field $F$ and let $K$ be a unital subalgebra of $\operatorname{End}(V)$ such that there is no nontrivial proper subspace of $V$ which is invariant under every $\alpha \in K$. Let $v \in V$ and let $W$ be a finitely-generated subspace of $V$ satisfying the condition that if $\alpha \in K$ and $W \subseteq \ker(\alpha)$ then $v \in \ker(\alpha)$. Then $v \in W$.

Proof We will prove the result by induction on $n = \dim(W)$. If $n = 0$ then $W$ is trivial. Since $K$ is unital, $\sigma_1 \in K$ and $W \subseteq \ker(\sigma_1)$. Therefore, by hypothesis, $v \in \ker(\sigma_1)$ and so $v = 0_V \in W$.

Now assume, inductively, that $n \geq 1$ and that the result has been established for all subspaces of $V$ of dimension less than $n$. Pick $0_V \neq w_0 \in W$ and let $W_1$ be a complement of $Fw_0$ in $W$. Set $L = \{\alpha \in K \mid W_1 \subseteq \ker(\alpha)\}$. This set is nonempty since $\sigma_0 \in L$. Moreover, it is in fact a subspace of $K$ as a vector space over $F$. Moreover, if $\alpha \in L$ and $\beta \in K$ then $\beta\alpha \in L$, so in particular $L$ is a subalgebra of $K$. Moreover, $Y = \{\alpha(w_0) \mid \alpha \in L\}$ is a subspace of $V$.

Since $w_0 \notin W_1$, we know from the induction hypothesis, applied to $W_1$, that there exists an element $\alpha_0$ of $L$ satisfying $w_0 \notin \ker(\alpha_0)$ and so $Y$ is nontrivial. However, $\beta(y) \in Y$ for each $y \in Y$ and $\beta \in K$. Thus $Y$ is invariant under every element of $K$ and so, by hypothesis, $Y = V$. Define the function $\theta: V \to V$ by $\theta: \alpha(w_0) \mapsto \alpha(v)$. This function is well-defined for if $\alpha_1(w_0) = \alpha_2(w_0)$ then $w_0 \in \ker(\alpha_1 - \alpha_2)$ and so $W \subseteq \ker(\alpha_1 - \alpha_2)$. Hence, by assumption, $v \in \ker(\alpha_1 - \alpha_2)$, i.e., $\alpha_1(v) = \alpha_2(v)$. It is straightforward to check that in fact $\theta \in \operatorname{End}(V)$.

If $\beta \in K$ then $(\theta\beta)(\alpha(w_0)) = \theta(\beta\alpha(w_0)) = \beta\alpha(v) = \beta(\theta(\alpha(w_0))) = (\beta\theta)(\alpha(w_0))$ and so $\theta$ commutes with every element of $K$. By Proposition 12.19, this implies that $\theta = \sigma_c$ for some $c \in F$. Thus, for any $\alpha \in L$ we have $\alpha(v) = \theta\alpha(w_0) = c\alpha(w_0) = \alpha(cw_0)$ and so $\alpha(v - cw_0) = 0_V$. By the induction hypothesis, this implies that $v - cw_0 \in W_1$ and so $v \in W$, as desired.


Proposition 12.21 (Burnside's Theorem) Let $V$ be a vector space finitely generated over an algebraically-closed field $F$ and let $K$ be a unital subalgebra of $\operatorname{End}(V)$ the elements of which commute with all endomorphisms of the form $\sigma_c$ for $c \in F$. Assume furthermore that there is no nontrivial proper subspace of $V$ which is invariant under every $\alpha \in K$. Then $K = \operatorname{End}(V)$.


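Proof Pick a basis $\{v_1, \ldots, v_n\}$ for $V$ over $F$ and, for all $1 \leq i, j \leq n$, let $\theta_{ij}$ be the endomorphism of $V$ defined by the condition that

$$\theta_{ij}: v_k \mapsto \begin{cases} v_j & \text{if } k = i, \\ 0_V & \text{otherwise.} \end{cases}$$

This is a basis for $\operatorname{End}(V)$ and so it suffices to show that $\theta_{ij} \in K$ for all $1 \leq i, j \leq n$. Fix $i \in \{1, \ldots, n\}$ and let

$$L_i = \{\alpha \in K \mid \alpha(v_h) = 0_V \text{ for all } h \neq i\}.$$

By Proposition 12.20, there is an element $\alpha_0$ of $L_i$ satisfying $\alpha_0(v_i) \neq 0_V$ and so, as in the proof of that proposition, we see that $\{\alpha(v_i) \mid \alpha \in L_i\}$ equals $V$. In particular, if $1 \leq j \leq n$ there exists an element $\beta_j$ of $L_i$ satisfying $\beta_j(v_i) = v_j$. Thus $\theta_{ij} = \beta_j \in K$.
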
The British mathematician William Burnside pub- 
lished important works on group theory at the end 
of the nineteenth century. Burnside’s original result 
has been extensively generalized. The above proof 
is based on the proof of one such generalization, 
by the twentieth-century American mathematician 
John Tate. 


Proposition 12.21 holds for the case of F = C. If the field F is not algebraically 
closed, this theorem may not hold. 


Example Let F = R and let a be the endomorphism of R* defined by  : | a 


Pak Then a* = —o; and so K = {ca + coy | c € R} is a proper subalgebra of 
End(R*) for which there are no nontrivial proper subspaces of R? invariant under 
every element of K. 


Algorithms for the computation of the eigenvalues and eigenvectors of a given matrix are usually very complicated, especially if speed of computation is a major consideration. Therefore, we shall not go into the description of such algorithms in detail. As a rule of thumb, it is best to try to compute eigenvectors directly, and not through finding roots of the characteristic polynomial, since small errors in the computation of eigenvalues may often lead to large errors in the computation of the corresponding eigenvectors. For matrices over $\mathbb{R}$, there are often reasonably efficient iterative methods to find at least some of the eigenvectors. We give here one example, which finds an eigenvector associated with the real eigenvalue of a matrix over $\mathbb{R}$ having greatest absolute value (often called the dominant eigenvalue), under the assumption that such an eigenvalue indeed exists. The algorithm is based on the observation that if $c$ is an eigenvalue of a matrix $A \in M_{n\times n}(\mathbb{R})$ then $c^k$ is an eigenvalue of $A^k$. Hence, if $k$ is sufficiently large, the matrix $A(A^k)$ is approximately equal to $cA^k$. Therefore, if we select an arbitrary vector $v^{(0)} \in \mathbb{R}^n$ and successively define vectors $v^{(1)}, v^{(2)}, \ldots$ by setting $v^{(i+1)} = Av^{(i)}$ for each $i \geq 0$, then $Av^{(k)} = A^{k+1}v^{(0)}$ and this is roughly equal to $cv^{(k)}$. So, if the circumstances are amenable (and we will not go into the precise conditions necessary for this to happen), the vector $v^{(k)}$ is a reasonable approximation to an eigenvector of $A$ associated with $c$. Of course, we must always remember that repeated computations lead to accumulating roundoff and truncation errors; one way of combating these is to divide each entry in $v^{(i)}$ by the absolute value of the largest entry, and use this "normalized" vector in the next iteration.
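
The iteration just described can be sketched as follows. This is a minimal illustration rather than a robust implementation: the function name `power_method`, the fixed iteration count, and the test matrix are all choices made here, and convergence is simply assumed. The test matrix $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ has eigenvalues 3 and 1, with $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ an eigenvector for the dominant eigenvalue 3.

```python
def power_method(A, v, iterations=100):
    """Repeat v <- A v, dividing by the largest absolute entry at each step.

    Returns (estimate of the dominant eigenvalue, normalized vector).
    """
    n = len(A)
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        scale = max(abs(x) for x in w)
        if scale == 0:
            raise ValueError("the iteration reached the zero vector")
        v = [x / scale for x in w]            # the "normalized" vector
    # Estimate c from A v = c v, using the entry of v of largest magnitude.
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    k = max(range(n), key=lambda i: abs(v[i]))
    return Av[k] / v[k], v

A = [[2.0, 1.0], [1.0, 2.0]]
c, v = power_method(A, [1.0, 0.0])
print(c, v)
```

Here the ratio of the two eigenvalues is $1/3$, so the error shrinks by roughly that factor per step; when the two largest eigenvalues are close in absolute value, convergence is correspondingly slow.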
Of the many numerical analysts who studied com- 
putational methods for finding eigenvalues, one of 
the most important is the British mathematician 
James H. Wilkinson, a former assistant of Alan 
Turing and one of the major early innovators in nu- 
merical linear algebra. The iteration algorithm given here was first studied in the 1920s by 
the Austrian applied mathematician Richard von Mises, who later emigrated to the United 
States. 


After one calculates the dominant eigenvalue of a matrix in $M_{n\times n}(\mathbb{R})$, there are various techniques, known as deflation techniques, for creating a new matrix in $M_{(n-1)\times(n-1)}(\mathbb{R})$ the eigenvalues of which are the same as all of the eigenvalues of the original matrix, except for the dominant eigenvalue.


5 1 


3 4 € M2 x2(R) and let us pick yO= ii then 


Example Consider A = 


1 
Ay = E , and so we will take py) = | ; 


av = 51 _to|: and so we will take =| 3]; 
Av = =I 36: and so we will take =| _ a]: 
Av = 2 Se: and so we will take =| |: 
Av ai| i: and so we will take =| a]. 


It seems that this sequence of vectors is converging to $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and, indeed, one can check that this is an eigenvector of $A$ associated with the eigenvalue 4.

Again, preconditioning can be used to make iterative methods for finding eigenvalues converge more rapidly.


Example Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{R})$ be a matrix of the form $[cB + (1 - c)D]^T$, where $B \in M_{n\times n}(\mathbb{R})$ is a Markov matrix, $c \in \mathbb{R}$ satisfies $0 < c < 1$, and $D = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}\begin{bmatrix} d_1 & \cdots & d_n \end{bmatrix}$ for nonnegative real numbers $d_i$ satisfying $\sum_{i=1}^n d_i = 1$. Such matrices have been called Google matrices since they are needed for the PageRank algorithm used by the internet search engine Google™ to compute an estimate of webpage importance for ranking search results (for these purposes, a typical value


for c is 0.85). The value of n can be very large, often far larger than 10°. 


One can show that the eigenvalues $e_1, \ldots, e_n$ of such a matrix satisfy $1 = |e_1| \geq |e_2| \geq \cdots \geq |e_n| \geq 0$, and so the power method mentioned above can be (and is) used by Google to rapidly compute an eigenvector associated to $e_1$. Stanford University researchers Taher Haveliwala and Sepandar Kamvar have shown that for any Google matrix, $|e_2| \leq c$, with equality happening under conditions that hold in the case of those matrices arising in this particular application. Eigenvectors corresponding to this second eigenvalue can be used to detect and combat link spamming on the internet.
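
To make the construction concrete, here is a small sketch that builds a Google matrix for a hypothetical three-page web and applies power iteration to it. Several things are assumptions made for illustration: the matrix $B$ is taken to be row-stochastic (each row nonnegative and summing to 1), the vector $d$ is uniform, the link structure is invented, and the iterate is rescaled so that its entries sum to 1 (a variant of the normalization described earlier, chosen so that the result reads as a probability vector).

```python
def google_matrix(B, d, c):
    """Entries of A = (c*B + (1 - c)*D)^T, where D[i][j] = d[j].

    B is assumed row-stochastic, so every column of A sums to 1.
    """
    n = len(B)
    return [[c * B[j][i] + (1 - c) * d[i] for j in range(n)] for i in range(n)]

def dominant_eigenvector(A, iterations=200):
    """Power iteration, rescaling so that the entries sum to 1."""
    n = len(A)
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        v = [x / total for x in w]
    return v

# Hypothetical link structure: page 0 links to pages 1 and 2,
# page 1 links to page 2, and page 2 links to page 0.
B = [[0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]
d = [1.0 / 3] * 3
A = google_matrix(B, d, c=0.85)
ranks = dominant_eigenvector(A)
print(ranks)
```

Since $|e_2| \leq c$, the error after $k$ steps is on the order of $c^k$, which is one reason a value such as $c = 0.85$ keeps the iteration count modest.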

One can also consider various generalizations of the eigenvalue problem. Thus, for example, given endomorphisms $\alpha$ and $\beta$ of a vector space $V$, one can seek to find all scalars $c$ such that $c\beta - \alpha$ is not monic. Problems of this sort arise naturally, for example, in plasma physics and in the design of control systems. Very often, such problems can be formulated as a matter of minimizing the largest generalized eigenvalue of a pair of symmetric matrices. When $\beta$ is an automorphism, as is usually the case, such generalized eigenvalue problems can be reduced to the usual eigenvalue problem for the endomorphism $\beta^{-1}\alpha$, but there are often reasons for not wanting to do so. For example, even if both $\alpha$ and $\beta$ are represented with respect to a given basis by symmetric matrices, the matrix representing $\beta^{-1}\alpha$ may not be symmetric. Therefore, some specialized algorithms have been developed to find solutions of the generalized eigenvalue problem directly.

If $V$ has finite dimension $n$ and the endomorphisms $\alpha$ and $\beta$ are represented with respect to some basis by matrices $A$ and $B$, respectively, one can look at the generalized characteristic polynomial $|XB - A|$. Problems arise, however, since the degree of this polynomial may be less than $n$, if the matrix $B$ is singular. In fact, this polynomial may even be the 0-polynomial.
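
The three possibilities mentioned here (full degree, reduced degree, and the 0-polynomial) already occur for $2 \times 2$ matrices. The sketch below expands $|XB - A|$ by hand for the $2 \times 2$ case; the function name and the example matrices are illustrative choices, not taken from the text.

```python
def generalized_char_poly_2x2(A, B):
    """Coefficients [c0, c1, c2] of det(X*B - A) = c0 + c1*X + c2*X^2."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # det(X*B - A) = (X*b11 - a11)(X*b22 - a22) - (X*b12 - a12)(X*b21 - a21)
    c2 = b11 * b22 - b12 * b21
    c1 = -(b11 * a22 + a11 * b22) + (b12 * a21 + a12 * b21)
    c0 = a11 * a22 - a12 * a21
    return [c0, c1, c2]

A = [[2, 1], [1, 2]]

# B nonsingular: the polynomial has degree 2.
full = generalized_char_poly_2x2(A, [[1, 0], [0, 2]])

# B singular: the degree drops below 2.
reduced = generalized_char_poly_2x2(A, [[1, 0], [0, 0]])

# A and B both singular with a common kernel direction: the 0-polynomial.
zero = generalized_char_poly_2x2([[1, 0], [0, 0]], [[0, 1], [0, 0]])

print(full, reduced, zero)
```

In the first case $B^{-1}A = \begin{bmatrix} 2 & 1 \\ 1/2 & 1 \end{bmatrix}$ is not symmetric even though $A$ and $B$ are, which illustrates the remark above about not reducing to the ordinary eigenvalue problem.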

A further generalization of the eigenvalue problem is the following: Given endomorphisms $\alpha_0, \ldots, \alpha_n$ of $V$, find all scalars $c$ such that the endomorphism $\sum_{i=0}^n c^i\alpha_i$ is not monic. Various techniques have been developed to handle this problem directly in special cases. Also, it can sometimes be reduced to the case of $n = 1$. For example, finding a vector $0_V \neq v \in V$ in the kernel of $c^2\alpha_2 + c\alpha_1 + \alpha_0$ is


equivalent to finding a nonzero element of the form I? | in the kernel of cB; — Bo, 


where the §; are the endomorphisms of V defined by 


.|* a1 (x) + ao(y) _| x —a7(x) 
po] 3] >| ao(y) a au:[3 [oa]. 
Exercise 738

Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{Q})$. Let $c_1, \ldots, c_n$ be the list of (not necessarily distinct) eigenvalues of $A$, considered as a matrix in $M_{n\times n}(\mathbb{C})$. Show that $\sum_{i=1}^n c_i$ and $\prod_{i=1}^n c_i$ are rational numbers.


Exercise 739 

Let $F$ be a field, let $n$ be a positive integer, and let $A, B \in M_{n\times n}(F)$. Assume that $A$ and $B$ have the same characteristic polynomial $p(X) \in F[X]$. Is it necessarily true that $p(X)$ is the characteristic polynomial of $AB$?


Exercise 740 
Find infinitely-many matrices in $M_{3\times 3}(\mathbb{R})$, all of which have characteristic polynomial $X(X - 1)(X - 2)$.


Exercise 741 


3 2 
Find the characteristic polynomial of | 1 4 1 | €M3x3(R). 
2 1 


Exercise 742 
Let a, b,c € R. Find the characteristic polynomial of the matrix 


0 00a 
a 00 Db 
Ok 2 € Maxa4(R). 
0 0c O 


Exercise 743 
Let $n$ be a positive integer. Show that every matrix $A \in M_{n\times n}(\mathbb{R})$ can be written as the sum of two nonsingular matrices.


Exercise 744 

Let $F = GF(3)$ and let $n$ be a positive integer. Let $D = [d_{ij}] \in M_{n\times n}(F)$ be a nonsingular diagonal matrix and let $A \in M_{n\times n}(F)$. Show that $1 \notin \operatorname{spec}(DA)$ if and only if $D - A$ is nonsingular.


Exercise 745 

Let F be a field of characteristic other than 2. For each positive integer n, let 
T, be the set of all diagonal matrices in My,» (IR) the diagonal entries of which 
belong to {—1, 1}. For any A € Mx, (IR), show that there exists a matrix D € T,, 
satisfying 1 ¢ spec(DA). 
Exercise 746

Let $n$ be a positive integer and let $\alpha: M_{n\times n}(\mathbb{C}) \to \mathbb{C}^n$ be the function defined by $\alpha: A \mapsto \begin{bmatrix} a_0 \\ \vdots \\ a_{n-1} \end{bmatrix}$, where $X^n + \sum_{i=0}^{n-1} a_i X^i$ is the characteristic polynomial of $A$. Is $\alpha$ a linear transformation?


Exercise 747 


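Define $\alpha \in \operatorname{End}(\mathbb{R}^3)$ by $\alpha: \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a - b \\ a + 2b + c \\ -2a + b - c \end{bmatrix}$. Find the eigenvalues of $\alpha$ and, for each eigenvalue, find the associated eigenspace.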


Exercise 748 

Let $A$ be a nonempty set and let $V$ be the collection of all subsets of $A$, which is a vector space over $GF(2)$. Let $B$ be a fixed subset of $A$ and let $\alpha: V \to V$ be the endomorphism defined by $\alpha: Y \mapsto Y \cap B$. Find the eigenvalues of $\alpha$ and, for each eigenvalue, find the associated eigenspace.


Exercise 749 

Let $V$ be a vector space over a field $F$ and let $\alpha$ be an endomorphism of $V$. Show that the one-dimensional subspaces of $V$ invariant under $\alpha$ are precisely those of the form $Fv$, where $v$ is an eigenvector of $\alpha$.


Exercise 750 


a b+c 
Define a € End(R*) by a: : a a . Find the eigenvalues of a. Do 
d 0 


there exist two-dimensional subspaces W and Y of IR‘, both invariant under a, 
such that R* = W @ Y? 


Exercise 751 

Let V be a vector space finitely generated over a field F and let a be an endo- 
morphism of V having an eigenvalue c. For any p(X) € F[X], show that p(c) is 
an eigenvalue of p(a). 


Exercise 752 

Let $V$ be the vector space of all functions in $\mathbb{R}^{\mathbb{R}}$ which are infinitely differentiable and let $\alpha: V \to V$ be the endomorphism of $V$ defined by $\alpha: f \mapsto f''$. If $n > 0$ is an integer, show that the function $f: x \mapsto \sin(nx)$ is an eigenvector of $\alpha$ and find the associated eigenvalue.
Exercise 753

Let $F$ be a field and let $V = F^{\mathbb{Z}}$. Let $\alpha$ be the endomorphism of $V$ defined by $\alpha(f): i \mapsto f(i + 1)$ for all $i \in \mathbb{Z}$. Determine whether $\operatorname{spec}(\alpha)$ is nonempty or not.


Exercise 754 

Let $V$ be the vector space composed of all polynomial functions from $\mathbb{R}$ to itself, let $a \in \mathbb{R}$, and let $\alpha$ be the endomorphism of $V$ defined by $\alpha(p): x \mapsto (x - a)[p'(x) + p'(a)] - 2[p(x) - p(a)]$, where $p'$ denotes the derivative of $p$. Find the eigenvalues of $\alpha$ and for each such eigenvalue, find the associated eigenspace.


Exercise 755 
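Let $\alpha$ be the endomorphism of $M_{2\times 2}(\mathbb{R})$ defined by

$$\alpha: \begin{bmatrix} a & b \\ c & d \end{bmatrix} \mapsto \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$$

Find the eigenvalues of $\alpha$ and for each such eigenvalue, find the associated eigenspace.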


Exercise 756 
Let V be a vector space over Q and let a € End(V) be a projection. Show that 
spec(a) C {0, 1}. Is the converse true? 


Exercise 757 

Let $V$ be a vector space of dimension $n > 0$ over a field $F$. Let $\alpha$ be an endomorphism of $V$ for which there exists a set $A$ of $n + 1$ distinct eigenvectors satisfying the condition that every subset of $A$ of size $n$ is a basis for $V$. Show that all of the eigenvectors in $V$ are associated with the same eigenvalue $c$ of $\alpha$ and that $\alpha = \sigma_c$.


Exercise 758 


For a,b e€R, let A= € M3,3(R). Find the eigenvalues of A. 


oes 
Sa 
aoc o 


Exercise 759 
Let A € M2,.2(R) be a matrix of the form b al where a > 0 and bc > 0. 


Show that A has two distinct eigenvalues in R. 


Exercise 760 


Find the eigenvalues of the matrix $\begin{bmatrix} 5 & 6 & -3 \\ -1 & 0 & 1 \\ 2 & 2 & -1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$ and, for each such eigenvalue, find the associated eigenspace.

Exercise 761

Find the eigenvalues of the matrix $\begin{bmatrix} 0 & 2 & 1 \\ -2 & 0 & 3 \\ -1 & -3 & 0 \end{bmatrix} \in M_{3\times 3}(\mathbb{C})$ and, for each such eigenvalue, find the associated eigenspace.


Exercise 762


Let $W$ be the subspace of $\mathbb{R}^\infty$ consisting of all convergent sequences and let $\alpha$ be the endomorphism of $W$ defined by

$$\alpha: [a_1, a_2, \ldots] \mapsto \left[\left(\lim_{i\to\infty} a_i\right) - a_1, \left(\lim_{i\to\infty} a_i\right) - a_2, \ldots\right].$$

Find all eigenvalues of $\alpha$ and, for each eigenvalue, find the corresponding eigenspace.


Exercise 763 


Find the eigenvalues of the matrix $\begin{bmatrix} 1 & -1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$ in $M_{3\times 3}(\mathbb{C})$ and, for each such eigenvalue, find the associated eigenspace.


Exercise 764 
Does there exist a real number $a$ such that

$$\operatorname{spec}\begin{bmatrix} 1 & -1 & 0 \\ 0 & a & -1 \\ -6 & 11 & -5 \end{bmatrix} = \{-2, -1, 0\}?$$


Exercise 765 

Let $\alpha$ be an endomorphism of a vector space $V$ over a field $F$ and let $v$ and $w$ be eigenvectors of $\alpha$. If $v + w \neq 0_V$, show that $v + w$ is an eigenvector of $\alpha$ if and only if both $v$ and $w$ correspond to the same eigenvalue.


Exercise 766 
a 


1 0 

Show that the matrix A=|a a a | has three distinct eigenvalues for any 
a 0 -l 

real number a. 


Exercise 767 

Let n be a positive integer and let t be a nonzero real number. Let A € My xn(R) 
be the matrix all of the entries of which equal t. Find the eigenvalues of A and, 
for each such eigenvalue, find the associated eigenspace. 


Exercise 768 
Let $n$ be a positive integer and let $F$ be a field. Let $A$ be a nonsingular matrix in $M_{n\times n}(F)$. Given the eigenvalues of $A$, find the eigenvalues of $A^{-1}$.



Exercise 769 
Let $n$ be a positive integer and let $F$ be a field. Let $A = [a_{ij}] \in M_{n\times n}(F)$, and let $c \in \operatorname{spec}(A)$. If $b, d \in F$, show that $bc + d \in \operatorname{spec}(bA + dI)$.


Exercise 770 
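Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_{2\times 2}(\mathbb{R})$. If $t \in \mathbb{R}$ is a root of the polynomial $bX^2 + (a - d)X - c \in \mathbb{R}[X]$, show that $\begin{bmatrix} 1 \\ t \end{bmatrix}$ is an eigenvector of $A$ associated with the eigenvalue $a + bt$.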

Exercise 771 

Let $A \in M_{2\times 2}(\mathbb{C})$ be a matrix having two distinct eigenvalues. Show that there are precisely four distinct matrices $B \in M_{2\times 2}(\mathbb{C})$ satisfying $B^2 = A$.


Exercise 772 


Find all $a \in \mathbb{R}$ such that $\begin{bmatrix} a & 0 & 0 \\ 2a & 2a & 2a \\ 0 & 0 & a \end{bmatrix}$ has a unique eigenvalue.


Exercise 773 
Find a real number $a$ such that the only eigenvalue of the matrix

$$\begin{bmatrix} a & 1 & 0 \\ -1 & 0 & -1 \\ 0 & 1 & -a \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$$

is 0.

Exercise 774 
1 
For each 1 <i <3 and 2 < j < 3, find a real number a;; such that | —1 |, 
0 
1 1 1 ay ay 
0], and | 1] are all eigenvectors of the matrix | 1 a2 a3] € 
-] 1 1 a3. 433 


M3x3(R). 


Exercise 775 

Let 0#r € C and let n and m be positive integers. Let A = [ajj] € Mnxn(C) 
be given and let B = [bjj] € Mnxn(C) be the matrix defined by b;; = pmtind qi 
for all 1 <i, 7 <n. Show that if d € C is an eigenvalue of A then rd is an 
eigenvalue of B. 


Exercise 776 
Let $n$ be a positive integer and let $F$ be a field. A matrix $A \in M_{n\times n}(F)$ is a magic matrix if and only if there exists a scalar $c \in F$ such that the sum of the entries in each row and each column is $c$. Characterize magic matrices in terms of their eigenvalues.


Exercise 777 
Let A € M2,.2(C) be a matrix having distinct eigenvalues a 4 b. Show that, for 
alln > 0, 


n n 


fee ey ee | 
~a—b <<. 


Exercise 778 
Let $A \in M_{2\times 2}(\mathbb{C})$ be a matrix having a unique eigenvalue $c$. Show that $A^n = c^{n-1}[nA - (n - 1)cI]$ for all $n > 0$.


Exercise 779 
Let n be a positive integer and let A € Mnxn(C). Show that every eigenvector 
of A is also an eigenvector of adj(A). 


Exercise 780 

Let n be a positive integer. Let G be the set of all matrices A € My xn(C) sat- 
isfying the condition that C” has a basis composed of eigenvectors of A. Is G 
closed under taking sums? Is it closed under taking products? 


Exercise 781 
Let p(X) € C[X] and let A € M,,.,(C) for some positive integer n. Calculate 
the determinant of the matrix p(A) using the eigenvalues of A. 


Exercise 782 


= a 
Let -14a€ Rand let A= ' one : ; “| € Mox2(R). Calculate A” 


a-—a 
for alln > 1. 


Exercise 783 
Let n be a positive integer. Given a matrix A € My xn(Q), find infinitely-many 
distinct matrices having the same eigenvalues as A. 


Exercise 784 


1 0 0 0 
: . 0 0 c O 

Let c € R. Find the spectral radius of A = 0 -- 0 0 € Max4(C). 
0 0 0 0 


Exercise 785 
Let $A \in M_{n\times n}(\mathbb{R})$ be a matrix all entries in which are positive and let $c$ be a positive real number greater than the spectral radius of $A$. Show that $|cI - A| > 0$.
Exercise 786

Let $n$ be a positive integer. Show that 1 is an eigenvalue of any Markov matrix in $M_{n\times n}(\mathbb{R})$.


Exercise 787 
Let $n$ be a positive integer. Let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ satisfy the condition that
$\sum_{i=1}^{n} |a_{ij}| < 1$ for all $1 \le j \le n$. Show that $|c| < 1$ for all $c \in \operatorname{spec}(A)$.


Exercise 788 

Let $n$ be a positive integer and let $F$ be a field. Let $A = [a_{ij}] \in M_{n\times n}(F)$ be a
matrix satisfying the condition that the sum of the entries in each row equals 1.
Let $1 \neq c \in \operatorname{spec}(A)$ and let $\begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$ be an eigenvector of $A$ associated with $c$.
Show that $\sum_{j=1}^{n} b_j = 0$.


Exercise 789 
Give an example of a matrix A € Mp2, 2(R) satisfying the condition that 
spec(A) = @ but spec(A*) ASD. 


Exercise 790 

Find an example of matrices $A, B \in M_{2\times 2}(\mathbb{R})$ satisfying the condition that
every element of $\operatorname{spec}(A) \cup \operatorname{spec}(B)$ is positive but every element of $\operatorname{spec}(AB)$ is
negative.


Exercise 791 
Find a polynomial p(X) € C[X] of degree 2 satisfying the condition that all 


l-a 
matrices in M C) of the form 
2x2¢ ) p(a) ‘| 


, for a € C, have the same char- 


acteristic polynomial. 


Exercise 792 

Let $F$ be a field and let $n$ be an even positive integer. Let $A, B \in M_{n\times n}(F)$ be
matrices satisfying $A = B^2$. Let $p(X)$ be the characteristic polynomial of $A$ and
let $q(X)$ be the characteristic polynomial of $B$. Show that $p(X^2) = q(X)q(-X)$.


Exercise 793 
Let $F$ be a field. Characterize the matrices in $M_{2\times 2}(F)$ having the property that
their characteristic polynomial is not equal to their minimal polynomial.


Exercise 794 
Let $(K, \bullet)$ be an associative unital $F$-algebra, let $v \in K$, and let $\alpha: K \to K$ be a
homomorphism of $F$-algebras. Show that $\operatorname{Ann}(v) \subseteq \operatorname{Ann}(\alpha(v))$.


Exercise 795


Are the matrices : | and E 


3 | in. Mo,2(R) similar? 


Exercise 796 


1 i 0 1+i 7 2 
Are the matrices | i 2 —1 | and 0 1 9 in. M3x3(C) similar? 
0 i 0 O 2-i 


Exercise 797 
Let $n$ be a positive integer and let $A, B \in M_{n\times n}(\mathbb{R})$. Show that if $A$ and $B$
are similar when considered as elements of $M_{n\times n}(\mathbb{C})$, they are also similar in
$M_{n\times n}(\mathbb{R})$.


Exercise 798 


1 0 1 
Find a diagonal matrix in. M3,.3(R) similar to the matrix | 0 1 

1 0 1 
Exercise 799 

0 0 1 
Find a diagonal matrix in. M3,.3(R) similar to the matrix | 0 0 0 

1 0 0 


Exercise 800 
Is there a diagonal matrix in M3,.3(R) similar to the matrix 


8 2 .=3 
-6 -l 3 |? 
12 6 —4 


Exercise 801 
Show that every matrix in the subspace of M2,2(R) generated by 


0 1 1 0 
1 O|/’?|O -1 
is similar to a diagonal matrix. 


Exercise 802 


in. M3 x3(GF(5)) are 


i) 
ooo 


0 1 0 0 
Determine if the matrices |} 0 O 1 | and | 1 
0 0 0 0 


similar. 


Exercise 803
Is there a diagonal matrix in M3,.3(R) similar to the matrix 


i =i: 2 
Lo Gd B19 
0. ad # 


Exercise 804 


1 0 0 
Let A=]0O 1 1} € M3,3(R). Find a nonsingular matrix P € M3 3(R) 
01 1 


such that P~! AP is a diagonal matrix. 


Exercise 805 


Show that the matrix A = E i € M2 x2(C) is not similar to a diagonal 


1 
matrix. 
Exercise 806 
1 -1 0 2 0 0 
Are the matrices | 0 2 5]and|] —-1 4 Of inM3,.3(Q) similar? 
0) 0 3 03 7 


Exercise 807 
Let $k$ and $n$ be positive integers and let $F$ be a field. Let $A \in M_{k\times n}(F)$ and
$B \in M_{n\times k}(F)$. Is the matrix $\begin{bmatrix} AB & O \\ B & O \end{bmatrix}$ similar to the matrix $\begin{bmatrix} O & O \\ B & BA \end{bmatrix}$?


Exercise 808 


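Let $F$ be a field and let $A = \begin{bmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{bmatrix} \in M_{3\times 3}(F)$. For any $p(X) \in F[X]$,
show that

$$p(A) = \begin{bmatrix} p(a) & p'(a) & \frac{1}{2}p''(a) \\ 0 & p(a) & p'(a) \\ 0 & 0 & p(a) \end{bmatrix},$$

where $p'(X)$ denotes the formal derivative of the polynomial $p(X)$ and $p''(X)$ is the
formal derivative of $p'(X)$.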


Exercise 809 
Find the characteristic and minimal polynomials of the matrix

$$\begin{bmatrix} 7 & 4 & -4 \\ 4 & -8 & -1 \\ -4 & -1 & -8 \end{bmatrix} \in M_{3\times 3}(\mathbb{R}).$$


Exercise 810

Let $n$ be a positive integer. Let $V$ be the vector space over $\mathbb{R}$ consisting of all
polynomial functions from $\mathbb{R}$ to itself having degree at most $n$. Let $\alpha$ be the
endomorphism of $V$ which assigns to each $f \in V$ its derivative, and let $A$ be a
matrix representing $\alpha$ with respect to some basis of $V$. Find the minimal
polynomial of $A$.


Exercise 811 


Find six distinct matrices in M2,.2(R) which annihilate the polynomial X 2_ 4], 


Exercise 812 
Let $n$ be a positive integer and let $c$ be an element of a field $F$. Find a matrix
$A \in M_{n\times n}(F)$ having minimal polynomial $(X - c)^n$.


Exercise 813 


Use the Cayley-Hamilton Theorem to find the inverse of

$$\begin{bmatrix} 5 & 1 & -1 \\ -6 & 0 & 2 \\ 0 & 0 & 2 \end{bmatrix} \in M_{3\times 3}(\mathbb{R}).$$


Exercise 814 
Let $n$ be a positive integer and let $A \in M_{n\times n}(F)$ be a matrix of rank $h$. Show
that the degree of the minimal polynomial of $A$ is at most $h + 1$.


Exercise 815 
Let $n$ be a positive integer and let $F$ be a field. Show that a matrix $A \in M_{n\times n}(F)$
is nonsingular if and only if $m_A(0) \neq 0$.


Exercise 816 


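Find the eigenvalues of the matrix $\begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix} \in M_{4\times 4}(\mathbb{Q})$ and determine
the algebraic multiplicity of each.


Exercise 817
Find the minimal polynomial of the matrix $\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 3 & -3 \end{bmatrix}$ in $M_{3\times 3}(\mathbb{R})$.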




Exercise 818 
Let a and be the endomorphisms of Q* represented with respect to the canon- 


1 0 -1 0 4 -1 -l 0 

: : . 0 1 0 -l —1 4 0 -1 

ical basis by the matrices oo. <4 0 and I 0 > 4 [te 
0 1 0 -l 0 1 -l 2 


spectively. Does there exist a basis of Q* with respect to which both of them can 
be represented by diagonal matrices? 


Exercise 819 
Let $\alpha$ be the endomorphism of $\mathbb{R}^3$ represented with respect to the canonical
basis by the matrix $\begin{bmatrix} -6 & 2 & -5 \\ 4 & 4 & -2 \\ 10 & -3 & 8 \end{bmatrix}$. Calculate the algebraic and geometric
multiplicities of each of the eigenvalues of $\alpha$.


Exercise 820 
Let $A = [a_{ij}] \in M_{n\times n}(\mathbb{C})$ be a symmetric tridiagonal matrix having an
eigenvalue $c$ with algebraic multiplicity $k$. Show that $a_{i-1,i} = 0$ for at least $k - 1$
values of $i$.


Exercise 821 
Let $\alpha$ be the endomorphism of $\mathbb{R}^3$ represented with respect to the canonical
basis by the matrix $\begin{bmatrix} -8 & -13 & -14 \\ -6 & -5 & -8 \\ 14 & 17 & 21 \end{bmatrix}$. Does there exist a basis of $\mathbb{R}^3$ with
respect to which $\alpha$ can be represented by a diagonal matrix?


Exercise 822 


Let A=] _ € Max4(Q). Find the minimal polyno- 


mial A. 


Exercise 823 

For each $t \in \mathbb{R}$, set $A(t) = \begin{bmatrix} \cos^2(t) & \cos(t)\sin(t) \\ \cos(t)\sin(t) & \sin^2(t) \end{bmatrix} \in M_{2\times 2}(\mathbb{R})$. Show
that all of these matrices have the same characteristic and minimal polynomials.


Exercise 824 

Let $a, b, c \in \mathbb{C}$. Find a necessary and sufficient condition for the minimal
polynomial of $\begin{bmatrix} 2 & 0 & 0 \\ a & 2 & 0 \\ b & c & 1 \end{bmatrix} \in M_{3\times 3}(\mathbb{C})$ to be equal to $(X - 1)(X - 2)$.


Exercise 825

Let $A = \begin{bmatrix} 1 & a & 0 \\ a & a & 1 \\ a & a & -1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. Find the set of all real numbers $a$ for which
the minimal and characteristic polynomials of $A$ are equal.


Exercise 826 

Let F = GF(5). For which values of a,b € F are the characteristic polynomial 
ab44 2 0 
bb b 3 3 

and minimal polynomial of the 5 x 5 matrix | 3 4 2b 1 3 | equal? What 
00 0 0 1 
0 0 0 3b 0 


if F = GF(7)? 


Exercise 827 
Let $F$ be a field and let $O \neq A \in M_{3\times 3}(F)$ be a matrix satisfying $A^k = O$ for
some positive integer $k$. Show that $A^3 = O$.


Exercise 828 

Let $F$ be a field and let $A \in M_{3\times 3}(F)$ be a matrix which can be written in the
form $BC$, where $B$ and $C$ are involutory matrices in $M_{3\times 3}(F)$. Show that $A$ is
nonsingular and similar to $A^{-1}$.


Exercise 829 
Let $A \in M_{n\times n}(F)$ be written in the form $A = PB$, where $P, B \in M_{n\times n}(F)$ and
$P$ is nonsingular. Show that $A$ is similar to $BP$.


Exercise 830 
Let $A \in M_{3\times 3}(\mathbb{Q})$ be a matrix satisfying the condition that $A^5 = I$. Show that
$A = I$.


Exercise 831 

Let $n$ be a positive integer and let $\alpha$ be an endomorphism of $M_{n\times n}(\mathbb{C})$,
considered as a vector space over $\mathbb{C}$, which satisfies the condition that $\alpha(A)$ is
nonsingular if and only if $A$ is nonsingular. Show that $\alpha$ is an automorphism.


Exercise 832 

Let $F$ be a field and let $n$ be a positive integer. Let $A \in M_{n\times n}(F)$ be a matrix
having characteristic polynomial $p(X) = X^n + \sum_{i=0}^{n-1} c_i X^i$. Show that, for each
$k \ge n$, we have $A^k = \sum_{j=0}^{n-1} b_j(k)A^j$, where

(1) $b_j(n) = -c_j$ for all $0 \le j \le n - 1$;

(2) $b_{-1}(k) = 0$ for all $k \ge n$;

(3) $b_j(k+1) = b_{j-1}(k) - c_j b_{n-1}(k)$ for all $k \ge n$ and all $0 \le j \le n - 1$.


Exercise 833
Show that there is no matrix A € M2 2(R) satisfying the condition that 
-1 0 
Dice 
Ar = 0 where o 1. 


Exercise 834 

Let $V$ be a vector space finitely generated over $\mathbb{C}$ and let $\alpha \in \operatorname{End}(V)$ be
diagonalizable. If $W$ is a nontrivial subspace of $V$ invariant under $\alpha$, is the restriction
of $\alpha$ to $W$ necessarily diagonalizable?


Exercise 835 

Let $V$ be a vector space finitely generated over a field $F$ and let $\alpha \in \operatorname{End}(V)$.
Show that $\alpha$ is diagonalizable if and only if the sum of all of its eigenspaces
equals $V$.


Exercise 836 

Find all rational numbers $a$ satisfying the condition that the endomorphism of $\mathbb{Q}^3$
represented with respect to some basis by the matrix $\begin{bmatrix} 1 & 0 & 0 \\ 1 & a & 0 \\ 0 & 0 & 1 \end{bmatrix}$ is
diagonalizable.


Exercise 837 
Give an example of an endomorphism $\alpha$ of $\mathbb{R}^3$ having nullity 2 which is not
diagonalizable.


Exercise 838 

Let $n$ be a positive integer and let $B \in M_{n\times n}(\mathbb{R})$ be a matrix all entries of which
are positive. Let $r > \rho(B)$. Show that

(1) The matrix $A = rI - B$ is nonsingular;

(2) All nondiagonal entries of $A$ are nonpositive;

(3) All entries of $A^{-1}$ are nonnegative; and

(4) If $a + bi \in \mathbb{C}$ is an eigenvalue of $A$, then $a > 0$.


Exercise 839 

Let $V$ be a vector space of finite odd dimension over $\mathbb{R}$ and let $\alpha_1, \ldots, \alpha_k$ be
distinct mutually-commuting endomorphisms of $V$, for some $k > 1$. Show that
these endomorphisms have a common eigenvector.


Exercise 840 
Let A € M3,.3(Q) have characteristic polynomial X? — bX? + cX — d. For all 
n > 3, show that 


A” = t_1A + tn_2 adj(A) + (tr — bty-1)1, 


where t, = Vaal 


Exercise 841


a 2b a Cc 
: |b 2a |b d 4 
Show that the endomorphisms a : al hea and Bp: el? aa of Q 
d 2c d b 


are diagonalizable and commute. Find a basis of Q¢ relative to which both a and 
B are represented by diagonal matrices. 


Exercise 842 
Let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ be a Markov matrix all entries of which are positive.
If $c \in \mathbb{C}$ is an eigenvalue of $A$ satisfying $|c| = 1$, show that $c = 1$.


Exercise 843 
Let $F$ be a finite field. Show that there exists a symmetric matrix in $M_{2\times 2}(F)$
having no eigenvalues.


Exercise 844 
Does there exist a square matrix A over R which is not idempotent but satisfies 
the condition that spec(A) = {1}? 


Exercise 845 

Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{Q})$ be a matrix all entries of which
are integers. Let $k$ be an integer which is an eigenvalue of $A$. Show that $|A|$ is an
integer and that $k$ divides $|A|$.


Exercise 846 

Let $V$ be a vector space of dimension 3 over a field $F$ and let $\alpha \in \operatorname{End}(V)$ have
nullity 2. Show that the characteristic polynomial of $\alpha$ is of the form $X^2(X - c)$
for some $c \in F$.


13 Krylov Subspaces


Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$. If $0_V \neq v_0 \in V$ then the
subspace $F\{v_0, \alpha(v_0), \alpha^2(v_0), \ldots\}$ of $V$ is called the Krylov subspace of $V$ defined
by $\alpha$ and $v_0$. The elements of this subspace are precisely those vectors in $V$ of the
form $p(\alpha)(v_0)$, where $p(X) \in F[X]$, and so it is natural to denote it by $F[\alpha]v_0$. It is
clear that $F[\alpha]v_0$ is invariant under $\alpha$.


Alexei Nikolaevich Krylov was a Russian applied mathematician who 
at the end of the nineteenth century developed many of the methods 
mentioned here in connection with the solution of differential equa- 
tions. 


Proposition 13.1 Let $V$ be a vector space over a field $F$, let $\alpha \in \operatorname{End}(V)$,
and let $0_V \neq v_0 \in V$.

(1) $F[\alpha]v_0$ is the intersection of all subspaces of $V$ containing $v_0$ and
invariant under $\alpha$;

(2) $v_0$ is an eigenvector of $\alpha$ if and only if $\dim(F[\alpha]v_0) = 1$.


Proof (1) Since $F[\alpha]v_0$ contains $v_0$ and is invariant under $\alpha$, it certainly contains the
intersection of all such subspaces of $V$. Conversely, if $W$ is a subspace of $V$ which
contains $v_0$ and is invariant under $\alpha$, then $p(\alpha)(v_0) \in W$ for all $p(X) \in F[X]$ and so
$F[\alpha]v_0 \subseteq W$. Thus we have the desired equality.

(2) If $v_0$ is an eigenvector of $\alpha$ associated with an eigenvalue $c$ then for each
$p(X) = \sum_{j=0}^{k} a_j X^j \in F[X]$ we have $p(\alpha)(v_0) = \sum_{j=0}^{k} a_j \alpha^j(v_0) = \sum_{j=0}^{k} a_j c^j v_0 \in Fv_0$,
proving that $F[\alpha]v_0 = Fv_0$ and so $\dim(F[\alpha]v_0) = 1$. Conversely, assume
that $\dim(F[\alpha]v_0) = 1$. Then $F[\alpha]v_0 = Fv_0$ since $Fv_0$ is a one-dimensional
subspace of $F[\alpha]v_0$. In particular, $\alpha(v_0) \in Fv_0$ and so there exists a scalar $c$ such that
$\alpha(v_0) = cv_0$, which proves that $v_0$ is an eigenvector of $\alpha$.


Since the set $\{v_0, \alpha(v_0), \alpha^2(v_0), \ldots\}$ is a generating set for $F[\alpha]v_0$ over $F$,
Proposition 13.1(2) suggests that $\dim(F[\alpha]v_0)$ can be used to measure how far $v_0$
is from being an eigenvector of $\alpha$.
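
To make this measure concrete, here is a minimal sketch in Python that computes the dimension of a Krylov subspace over the rationals. It assumes the endomorphism is given by a square matrix `A` (a list of rows) and that the starting vector is nonzero; the names `mat_vec`, `rank`, and `krylov_dim` are illustrative and do not come from the text. It uses the fact that, in dimension $n$, the vectors $v_0, Av_0, \ldots, A^{n-1}v_0$ already span the Krylov subspace.

```python
from fractions import Fraction


def mat_vec(A, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over Q."""
    M = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(M[0]) if M else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for i in range(rk + 1, len(M)):
            f = M[i][col] / M[rk][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[rk])]
        rk += 1
    return rk


def krylov_dim(A, v0):
    """Dimension of the Krylov subspace generated by A and a nonzero v0."""
    vecs, v = [], list(v0)
    for _ in range(len(A)):
        vecs.append(v)
        v = mat_vec(A, v)
    return rank(vecs)


# An eigenvector gives dimension 1; a sum of eigenvectors for distinct
# eigenvalues gives dimension 2.
print(krylov_dim([[2, 0], [0, 3]], [1, 0]))
print(krylov_dim([[2, 0], [0, 3]], [1, 1]))
```

Exact `Fraction` arithmetic is used so that rank decisions are not affected by rounding.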

As a first example of the use to which we can put Krylov subspaces, we
will see how to use the minimal polynomial to solve systems of linear
equations. Let $V$ be a vector space over a field $F$ and let $V^\infty$ be the space of
all infinite sequences of elements of $V$. Every polynomial $p(X) = \sum_{j=0}^{k} a_j X^j \in F[X]$
defines an endomorphism $\theta_p$ of $V^\infty$ by

$$\theta_p : [v_0, v_1, \ldots] \mapsto \left[\sum_{j=0}^{k} a_j v_j,\ \sum_{j=0}^{k} a_j v_{j+1},\ \sum_{j=0}^{k} a_j v_{j+2},\ \ldots\right].$$

Note that if $p(X) = c$ is a polynomial of degree
no greater than 0, then $\theta_p = \sigma_c$. It is also easy to verify that $\theta_{pq} = \theta_p\theta_q = \theta_q\theta_p$ for
all $p(X), q(X) \in F[X]$.

A sequence $y \in V^\infty$ is linearly recurrent if and only if there exists a polynomial
$p(X) \in F[X]$ with $y \in \ker(\theta_p)$. In this case, we say that $p(X)$ is a characteristic
polynomial of $y$. If $p(X) \in F[X]$ is a characteristic polynomial of $y \in V^\infty$ and if
$q(X) \in F[X]$ is a characteristic polynomial of $z \in V^\infty$ then $\theta_{pq}(y + z) = \theta_q\theta_p(y) +
\theta_p\theta_q(z) = [0, 0, \ldots]$ and so $p(X)q(X)$ is a characteristic polynomial of $y + z$. It is
also clear that $p(X)$ is a characteristic polynomial of $cy$ for all $c \in F$. Thus we
see that the set of all linearly recurrent sequences in $V^\infty$ is a subspace of $V^\infty$,
which we will denote by $LR(V)$. If $y \in LR(V)$, there is precisely one characteristic
polynomial which is monic and of minimal degree. This polynomial will be called
the minimal polynomial of $y$. The degree of the minimal polynomial of $y$ is the
order of recurrence of $y$.


Linearly recurrent sequences in $\mathbb{R}^\infty$ were considered by the
seventeenth-century French-born mathematician Abraham de
Moivre, who spent most of his life in exile in England and was one of
the fathers of the theory of probability.


Example Let $F$ be a field and let $n$ be a positive integer. If $A \in M_{n\times n}(F)$, a
polynomial $p(X) \in F[X]$ is a characteristic (resp., minimal) polynomial of the sequence
$[I, A, A^2, \ldots]$ if and only if it is the characteristic (resp., minimal) polynomial of the
matrix $A$.


Example Let $V = F = \mathbb{Q}$ and let $y = [a_0, a_1, \ldots] \in V^\infty$ be the sequence defined
by $a_0 = 0$, $a_1 = 1$, and $a_{i+2} = a_{i+1} + a_i$ for all $i \ge 0$. This sequence is called
the Fibonacci sequence. Its minimal polynomial is $X^2 - X - 1$. The roots of this
polynomial are $\frac{1}{2}(1 \pm \sqrt{5})$. The number $\frac{1}{2}(1 + \sqrt{5})$ is called the golden ratio and
artists consider rectangles the sides of which are related by the golden ratio to be of
high aesthetic value. This ratio, which appears in ancient Egyptian and Babylonian
texts, appears in nature and is basic in the analysis of certain patterns of growth in
nature (such as the spirals of a snail shell or a sunflower), of Greek architecture, of
Renaissance painting, and even such modern designs as the ratio of the dimensions
of a credit card or of A4 paper. Notice that $X^2 - X - 1$ is also the characteristic
polynomial of the matrix $\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \in M_{2\times 2}(\mathbb{R})$, and so the eigenvalues of this
matrix are also precisely $\frac{1}{2}(1 \pm \sqrt{5})$. The eigenspace associated with $\frac{1}{2}(1 + \sqrt{5})$ is
$\mathbb{R}\begin{bmatrix} 1 \\ \frac{1}{2}(1 + \sqrt{5}) \end{bmatrix}$ and the eigenspace associated with $\frac{1}{2}(1 - \sqrt{5})$ is $\mathbb{R}\begin{bmatrix} 1 \\ \frac{1}{2}(1 - \sqrt{5}) \end{bmatrix}$.
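
The action of $\theta_p$ on a sequence can be checked on a finite prefix. The following is a small sketch, not part of the original text: the helper names `theta` and `fibonacci` are illustrative, and a polynomial is represented by its coefficient list `[a_0, ..., a_k]`. Only those entries of $\theta_p(y)$ that are determined by the given prefix are returned.

```python
def theta(p, seq):
    """Apply theta_p to a finite prefix of a sequence.

    p is the coefficient list [a_0, ..., a_k] of p(X) = sum a_j X^j.
    Entry i of the result is sum_j a_j * seq[i + j].
    """
    k = len(p) - 1
    return [sum(p[j] * seq[i + j] for j in range(k + 1))
            for i in range(len(seq) - k)]


def fibonacci(n):
    """First n terms of the Fibonacci sequence, starting 0, 1."""
    seq = [0, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]


fib = fibonacci(12)
# X^2 - X - 1 annihilates the Fibonacci sequence ...
print(theta([-1, -1, 1], fib))
# ... whereas X - 1 does not.
print(theta([-1, 1], fib))
```

The first call returns a list of zeros, which is what it means for $X^2 - X - 1$ to be a characteristic polynomial of the sequence; the second does not, since no polynomial of degree 1 annihilates it.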


Leonardo Fibonacci was born in Italy in the 
twelfth century and educated in Tunis, bringing 
back the fruits of Arab mathematics to Europe. 
His book Liber Abaci, written in 1202, contained 
the first new mathematical research in Christian 
Europe in over 1000 years. In 1509, Fra Luca 
Pacioli, one of the most important Renaissance 
mathematicians, wrote a book, The Divine Pro- 
portion, illustrated by his friend Leonardo da 
Vinci, about the golden ratio. 


We note that if $V = F$ and if $y \in LR(F)$ is a sequence having order of
recurrence at most $n$, then there exist algorithms, which are essentially extensions of the
Euclidean algorithm, to calculate the coefficients of the minimal polynomial of $y$ in
an order of $n^2$ arithmetic operations in $F$.
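
One widely used algorithm of this kind is the Berlekamp-Massey algorithm; the text does not name a specific algorithm, so its use here is an assumption made for illustration. The sketch below works over a prime field $GF(p)$ and takes a finite prefix of the sequence. It returns the coefficient list `[a_0, ..., a_L]` (with `a_L = 1`) of the monic polynomial of least degree that annihilates the prefix; this agrees with the minimal polynomial of the full sequence provided the prefix has length at least twice the order of recurrence.

```python
def berlekamp_massey(s, p):
    """Shortest linear recurrence for the prefix s over GF(p), p prime.

    Returns [a_0, ..., a_L] with a_L = 1 such that
    sum_j a_j * s[i + j] == 0 (mod p) for every i with i + L < len(s).
    Requires Python 3.8+ for pow(b, -1, p).
    """
    C, B = [1], [1]          # current and previous connection polynomials
    L, m, b = 0, 1, 1        # current length, shift, previous discrepancy
    for n in range(len(s)):
        # discrepancy between s[n] and the value predicted by C
        d = s[n]
        for i in range(1, L + 1):
            d += C[i] * s[n - i]
        d %= p
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, -1, p) % p
        T = C[:]
        if len(C) < len(B) + m:
            C += [0] * (len(B) + m - len(C))
        for i, bi in enumerate(B):       # C(x) -= coef * x^m * B(x)
            C[i + m] = (C[i + m] - coef * bi) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
        if len(C) < L + 1:               # keep C indexable up to degree L
            C += [0] * (L + 1 - len(C))
    # reverse the connection polynomial to get the polynomial in X
    return [C[L - j] for j in range(L + 1)]


# Over GF(5): expected to recover X^3 + 3X + 1 and X^2 + 2X + 2.
print(berlekamp_massey([0, 1, 2, 2, 3, 2], 5))
print(berlekamp_massey([3, 0, 4, 2, 3, 0], 5))
```

The two sample sequences are the scalar sequences that appear in the worked example over $GF(5)$ later in this chapter.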

Now let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $\alpha$
be an automorphism of $V$ having minimal polynomial $p(X) \in F[X]$. If $w \in V$
then the sequence $y = [w, \alpha(w), \alpha^2(w), \ldots]$ belongs to $\ker(\theta_p)$ and hence to
$LR(V)$. Therefore, this sequence has a minimal polynomial $q(X) = \sum_{j=0}^{k} c_j X^j$,
which divides the polynomial $p(X)$ in $F[X]$. Since $\alpha$ is an automorphism, we
can assume that $c_0 \neq 0$ and so we see that if $u = -c_0^{-1}\sum_{j=1}^{k} c_j \alpha^{j-1}(w)$ then
$\alpha(u) = w$ and so $u = \alpha^{-1}(w)$. In particular, if $V = F^n$ for some positive integer
$n$ and if $\alpha$ is represented by a matrix $A$ with respect to the canonical basis, then
$u = -c_0^{-1}\sum_{j=1}^{k} c_j A^{j-1}w$ is the unique solution of the system of linear equations
$AX = w$. If we set $q^*(X) = -c_0^{-1}\sum_{j=1}^{k} c_j X^{j-1}$ then $u = q^*(A)w$, and this could
be computed quickly were we to already know $q(X)$.

How does one calculate $q^*(X)$ in practice? One method used is basically
probabilistic: we randomly choose a vector $u \in F^n$ and compute the minimal polynomial
$q_u(X)$ of $y_u = [u \odot w, u \odot (Aw), u \odot (A^2w), \ldots] \in F^\infty$, something which can be
done, as we have already observed, in an order of $n^2$ arithmetic operations in $F$.
After that, we check whether the minimal polynomial of $y_u$ is also the minimal
polynomial of $y$. In general, it will not be so, but it will divide the minimal
polynomial of $y$ and so after a reasonable number of such attempts we will, usually, have
enough information on hand to reconstruct the minimal polynomial of $y$.


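Example Let $F = GF(5)$, let $A = \begin{bmatrix} 1 & 4 & 4 \\ 4 & 0 & 3 \\ 1 & 2 & 4 \end{bmatrix} \in M_{3\times 3}(F)$, and let $w = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} \in F^3$.
The sequence $w, Aw, A^2w, \ldots$ looks like

$$\begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix},\ \begin{bmatrix} 0 \\ 3 \\ 3 \end{bmatrix},\ \begin{bmatrix} 4 \\ 4 \\ 3 \end{bmatrix},\ \begin{bmatrix} 2 \\ 0 \\ 4 \end{bmatrix},\ \begin{bmatrix} 3 \\ 0 \\ 3 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},\ \ldots$$

If we choose $u = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ we obtain the sequence $y_u = [3, 0, 4, 2, 3, 0, \ldots]$ in $F^\infty$
and the minimal polynomial $q_u(X)$ of this sequence equals $X^2 + 2X + 2$. Since
$q_u(A)w = \begin{bmatrix} 0 \\ 2 \\ 3 \end{bmatrix}$, we see that this polynomial is not the minimal polynomial of $y$.
We will try again with $u = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}$. For this choice, we get $y_u = [0, 1, 2, 2, 3, 2, \ldots]$
and this has minimal polynomial $X^3 + 3X + 1$. Since the minimal polynomial of $y$
has to be a multiple of this polynomial, and has to be of degree at most 3, it must equal
$X^3 + 3X + 1$ and, indeed, $q_u(A)w = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$. Here $c_0 = 1$, so
$q^*(X) = -(X^2 + 3) = 4X^2 + 2$ and $q^*(A)w = \begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}$, which is the solution of $AX = w$.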

Now let us return to an important problem which was considered in the previous
chapter. Let $V$ be a vector space finitely generated over a field $F$. Given an
endomorphism $\alpha \in \operatorname{End}(V)$, how can we find a basis of $V$ relative to which $\alpha$ is represented
by a matrix which is as nice as possible? We have already found out when $\alpha$ is
diagonalizable. But what if $\alpha$ is not diagonalizable? Given a vector $0_V \neq w \in V$,
there exists a positive integer $k$ such that the set $\{w, \alpha(w), \ldots, \alpha^{k-1}(w)\}$ is
linearly independent but the set $\{w, \alpha(w), \ldots, \alpha^k(w)\}$ is linearly dependent. Then
$\{w, \alpha(w), \ldots, \alpha^{k-1}(w)\}$ is a basis for the Krylov subspace $F[\alpha]w$ of $V$, which
is called the canonical basis of this subspace. The restriction of $\alpha$ to $F[\alpha]w$ is
represented by a matrix of the form

$$\begin{bmatrix}
0 & 0 & \cdots & 0 & 0 & c_1 \\
1 & 0 & \cdots & 0 & 0 & c_2 \\
0 & 1 & \cdots & 0 & 0 & c_3 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 & c_{k-1} \\
0 & 0 & \cdots & 0 & 1 & c_k
\end{bmatrix}$$

with respect to the canonical basis, where the scalars $c_1, \ldots, c_k$ satisfy $\alpha^k(w) = \sum_{j=1}^{k} c_j \alpha^{j-1}(w)$.
This, of course, is just the companion matrix of the polynomial $X^k - \sum_{j=1}^{k} c_j X^{j-1}$.
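
As an illustration of this construction, the following sketch builds the canonical basis of a Krylov subspace and the corresponding companion matrix for a matrix with rational entries. It assumes that `w` is nonzero; the helper names `solve_in_span` and `krylov_companion` are hypothetical, introduced only for this example.

```python
from fractions import Fraction


def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def solve_in_span(basis, target):
    """Coefficients c with sum c[j] * basis[j] == target, or None if the
    target is not in the span. The basis is assumed linearly independent."""
    n, k = len(target), len(basis)
    # augmented matrix whose first k columns are the basis vectors
    M = [[Fraction(basis[j][i]) for j in range(k)] + [Fraction(target[i])]
         for i in range(n)]
    row = 0
    for col in range(k):
        piv = next((r for r in range(row, n) if M[r][col] != 0), None)
        if piv is None:
            return None
        M[row], M[piv] = M[piv], M[row]
        M[row] = [x / M[row][col] for x in M[row]]
        for r in range(n):
            if r != row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[row])]
        row += 1
    if any(M[r][k] != 0 for r in range(row, n)):
        return None
    return [M[j][k] for j in range(k)]


def krylov_companion(A, w):
    """Return the canonical basis [w, Aw, ..., A^(k-1)w] and the k x k
    companion matrix representing the restriction of A to the subspace."""
    basis, v = [list(w)], mat_vec(A, w)
    while True:
        c = solve_in_span(basis, v)
        if c is not None:
            break
        basis.append(v)
        v = mat_vec(A, v)
    k = len(basis)
    comp = [[Fraction(int(i == j + 1)) for j in range(k)] for i in range(k)]
    for i in range(k):
        comp[i][k - 1] = c[i]      # last column holds c_1, ..., c_k
    return basis, comp


basis, comp = krylov_companion([[0, 1], [1, 1]], [1, 0])
print(basis, comp)
```

For the matrix and vector in the usage line, $A^2w = w + Aw$, so the last column of the companion matrix is $(1, 1)$, corresponding to the polynomial $X^2 - X - 1$.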

Krylov subspaces are also the basis for a family of non-stationary iterative
algorithms, known as Krylov algorithms, used for approximating solutions to systems of
equations of the form $AX = w$, where $A \in M_{n\times n}(\mathbb{R})$ or $A \in M_{n\times n}(\mathbb{C})$. Similarly,
Krylov subspaces are a basis for a family of non-stationary iterative algorithms,
known as Lanczos algorithms, used for approximating eigenvalues of sufficiently-nice
(e.g., symmetric) matrices. Such algorithms work even under the assumption
that we do not have direct access to the entries of $A$ but do have a "black box"
ability to compute $Av$ or $A^Tv$ for any given vector $v \in \mathbb{R}^n$. Of course, they do
not work for all matrices, but when they work they tend to be fairly efficient and
rapid, and are especially good for large sparse matrices. Moreover, they are also
amenable to implementation on parallel computers. Parallel Lanczos algorithms
have also been developed for solving generalized eigenvalue problems. If the
matrices involved are symmetric, Lanczos algorithms can be adapted to work for matrices
over finite fields. However, in this case there are also other algorithms available. In
particular, one should mention the Wiedemann algorithm to solve systems of linear
equations of the form $AX = w$, where $A$ is a large nonsingular matrix over a finite
field. Such problems arise in the computation of discrete logarithms and in other
modes of attack on various encryption methods for transmission of data over the
internet. They have also been used to factor large integers. The Wiedemann algorithm,
which is based on computing the minimal polynomial of a certain linearly-recurrent
sequence, works especially well for sparse matrices, and is amenable to parallel
computation.

A nonzero element $a$ of an associative algebra $(K, \bullet)$ is nilpotent if and only if
there exists a positive integer $k$ satisfying $a^k = 0_K$. The smallest such integer $k$, if
one exists, is called the index of nilpotence of $a$. In particular, if $V$ is a vector space
over a field $F$, then $\alpha \in \operatorname{End}(V)$ is nilpotent if and only if there exists a positive
integer $k$ satisfying $\alpha^k = \sigma_0$.


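Example Let $F$ be a field and let $\alpha$ be the endomorphism of $F^3$ defined by
$\alpha: \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ a \\ b \end{bmatrix}$. Then $\alpha$ is a nilpotent endomorphism, having index of
nilpotence 3. The endomorphism $\beta$ of $F^3$ defined by $\beta: \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} -a + 2b + c \\ 0 \\ -a + 2b + c \end{bmatrix}$ is a
nilpotent endomorphism, having index of nilpotence 2.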


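Example Let $F$ be a field and let $\alpha$ and $\beta$ be the endomorphisms of $F^2$ defined
by $\alpha: \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} 0 \\ a \end{bmatrix}$ and $\beta: \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ 0 \end{bmatrix}$. Both endomorphisms are nilpotent, but
$\alpha + \beta$ is clearly not nilpotent.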

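If $\alpha$ is a nilpotent endomorphism of a vector space $V$ and $w \in V \setminus \ker(\alpha)$ then
the restriction of $\alpha$ to $F[\alpha]w$ is represented with respect to the canonical basis of
$F[\alpha]w$ by a matrix of the form

$$\begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 \\
1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & \cdots & 1 & 0
\end{bmatrix}.$$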


Proposition 13.2 Let $V$ be a vector space over a field $F$ and let $\alpha$ be a
nilpotent endomorphism of $V$ having index of nilpotence $k$. Then there exists
a vector $w \in V$ satisfying the condition that $\dim(F[\alpha]w) = k$.


Proof We know that $\alpha^k = \sigma_0$ but that there exists a vector $0_V \neq w \in V$ such
that $\alpha^{k-1}(w) \neq 0_V$. We will have proven the proposition should we be able to show
that the set $\{w, \alpha(w), \ldots, \alpha^{k-1}(w)\}$ is linearly independent. And, indeed, assume
that we have scalars $a_0, \ldots, a_{k-1} \in F$, not all equal to 0, satisfying $\sum_{i=0}^{k-1} a_i \alpha^i(w) = 0_V$. Let $t$ be the
smallest index such that $a_t \neq 0$. Then if we apply the endomorphism $\alpha^{k-t-1}$ to
$\sum_{i=t}^{k-1} a_i \alpha^i(w)$ we get $0_V = a_t \alpha^{k-1}(w) + a_{t+1}\alpha^k(w) + \cdots + a_{k-1}\alpha^{2k-t-2}(w) = a_t \alpha^{k-1}(w)$ and
so $a_t = 0$, which is a contradiction. Therefore, we conclude that $a_i = 0$ for all $i$, and
so the set is linearly independent, as required.


In particular, we see that if $V$ is a vector space of finite dimension over a field $F$
and if $\alpha$ is a nilpotent endomorphism of $V$, then the index of nilpotence of $\alpha$ is no
greater than $\dim(V)$.
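
This bound gives a finite test for nilpotence of a square matrix: it suffices to examine the powers $A, A^2, \ldots, A^n$. A minimal sketch follows, with illustrative function names and with matrices given as lists of rows; the sample matrices represent, with respect to the canonical basis, the endomorphism $(a, b, c) \mapsto (0, a, b)$, the endomorphism $(a, b, c) \mapsto (-a + 2b + c, 0, -a + 2b + c)$, and the map that swaps the two coordinates of $F^2$.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def nilpotence_index(A):
    """Smallest k >= 1 with A^k = O, or None if A is not nilpotent.

    Since the index of nilpotence is at most n for an n x n matrix,
    only the powers A^1, ..., A^n need to be checked.
    """
    n = len(A)
    power = A
    for k in range(1, n + 1):
        if all(x == 0 for row in power for x in row):
            return k
        power = mat_mul(power, A)
    return None


shift = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
rank_one = [[-1, 2, 1], [0, 0, 0], [-1, 2, 1]]
swap = [[0, 1], [1, 0]]
print(nilpotence_index(shift), nilpotence_index(rank_one), nilpotence_index(swap))
```

With this convention the zero matrix is reported as having index 1, although the definition in the text reserves the word nilpotent for nonzero elements.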


Proposition 13.3 Let $V$ be a vector space finitely generated over a field $F$
and let $\alpha$ be a nilpotent endomorphism of $V$ having index of nilpotence $k$. If
$w \in V$ satisfies the condition that $\dim(F[\alpha]w) = k$ then the subspace $F[\alpha]w$
of $V$ has a complement in $V$ which is invariant under $\alpha$.


Proof We will proceed by induction on $k$. If $k = 1$ then $\alpha = \sigma_0$ and so $F[\alpha]w =
Fw$. Then there is a subset $B$ of $V \setminus Fw$ such that $B \cup \{w\}$ is a basis for $V$, and $B$
is a basis for a complement of $Fw$ in $V$, which is trivially invariant under $\alpha$. Assume that $k > 1$ and that the result has
been established for any vector space finitely generated over $F$ and any nilpotent
endomorphism of that space having index of nilpotence less than $k$.

We know that $\operatorname{im}(\alpha)$ is invariant under $\alpha$ and that the restriction of $\alpha$ to
$\operatorname{im}(\alpha)$ is nilpotent, having index of nilpotence $k - 1$. We know that the set
$\{w, \alpha(w), \ldots, \alpha^{k-1}(w)\}$ forms a basis for $F[\alpha]w$ and so the set $\{\alpha(w), \ldots,
\alpha^{k-1}(w)\}$ forms a basis for the image $U$ of $F[\alpha]w$ under $\alpha$. Therefore, $U =
F[\alpha]\alpha(w)$ is a subspace of $\operatorname{im}(\alpha)$ and, by the induction hypothesis, it has a
complement $W_2$ in $\operatorname{im}(\alpha)$ invariant under $\alpha$.

Let $W_0 = \{v \in V \mid \alpha(v) \in W_2\}$. This is a subspace of $V$ containing $W_2$, since $W_2$
is invariant under $\alpha$. But $\alpha(v) \in W_2 \subseteq W_0$ for all $v \in W_0$ and so $W_0$ is also invariant
under $\alpha$. Our first assertion is that $V = F[\alpha]w + W_0$. And, indeed, if $x \in V$ then
$\alpha(x) \in \operatorname{im}(\alpha) = U \oplus W_2$ and so $\alpha(x) = u + w_2$, where $u \in U$ and $w_2 \in W_2$. But
$u = \alpha(y)$ for some $y \in F[\alpha]w$ and $x = y + (x - y)$. The first summand belongs to
$F[\alpha]w$, whereas, as to the second, we have $\alpha(x - y) = \alpha(x) - \alpha(y) = \alpha(x) - u =
w_2 \in W_2$ and so $x - y \in W_0$, proving the assertion.

Our second assertion is that $F[\alpha]w \cap W_0 \subseteq U$. Indeed, if $x \in F[\alpha]w \cap W_0$
then $\alpha(x) \in U \cap W_2 = \{0_V\}$ and so $x \in \ker(\alpha)$. Since $x \in F[\alpha]w$, we know that
there exist scalars $a_0, \ldots, a_{k-1}$ such that $x = \sum_{i=0}^{k-1} a_i \alpha^i(w)$ and hence $0_V =
\alpha(x) = \sum_{i=0}^{k-2} a_i \alpha^{i+1}(w)$, which implies that $a_0 = \cdots = a_{k-2} = 0$. Therefore, $x =
a_{k-1}\alpha^{k-1}(w) \in U$, proving the second assertion.

In particular, from what we have seen, we deduce that the subspaces $W_2$ and
$F[\alpha]w \cap W_0$ are disjoint. Therefore, $W_2 \oplus (F[\alpha]w \cap W_0)$ is a subspace of $W_0$.
This subspace has a complement $W_1$ in $W_0$. Thus we have $W_0 = W_1 \oplus W_2 \oplus
(F[\alpha]w \cap W_0)$.

Our third assertion is that $W = W_1 \oplus W_2$ is a complement of $F[\alpha]w$ in $V$ which
is invariant under $\alpha$, and should we prove this, we will have proven the proposition.
Indeed, we immediately note that $\alpha(W) \subseteq \alpha(W_0) \subseteq W_2 \subseteq W$ and so $W$ is surely
invariant under $\alpha$. Moreover, $F[\alpha]w \cap W = \{0_V\}$ since this subspace is contained
in the intersection of $W$ and $F[\alpha]w \cap W_0$, which, by the choice of $W$, equals $\{0_V\}$.
Finally,

$$V = F[\alpha]w + W_0 = F[\alpha]w + [W_1 + W_2 + (F[\alpha]w \cap W_0)]
= F[\alpha]w + W_1 + W_2 = F[\alpha]w + W$$

and so $V = F[\alpha]w \oplus W$.


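Proposition 13.4 (Rational Decomposition Theorem) Let $V$ be a vector
space of finite dimension $n$ over a field $F$ and let $\alpha$ be a nilpotent endomorphism
of $V$ having index of nilpotence $k$. Then there exist natural numbers $k = k_1 \ge
\cdots \ge k_t$ satisfying $k_1 + \cdots + k_t = n$, and there exist vectors $v_1, \ldots, v_t$ in $V$
such that $\{v_1, \alpha(v_1), \ldots, \alpha^{k_1-1}(v_1), v_2, \alpha(v_2), \ldots, \alpha^{k_2-1}(v_2), \ldots, v_t, \alpha(v_t),
\ldots, \alpha^{k_t-1}(v_t)\}$ forms a basis for $V$. The matrix which represents $\alpha$ with
respect to this basis is of the form

$$\begin{bmatrix}
A_1 & O & \cdots & O \\
O & A_2 & \cdots & O \\
\vdots & \vdots & \ddots & \vdots \\
O & O & \cdots & A_t
\end{bmatrix},$$

where each $A_i$ is of the form

$$\begin{bmatrix}
0 & 0 & \cdots & 0 & 0 \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{bmatrix}$$

in $M_{k_i\times k_i}(F)$.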


Proof Choose $k_1 = k$ and choose $v_1 \notin \ker(\alpha^{k-1})$. Then $U_1 = F[\alpha]v_1$ has a basis
$\{v_1, \alpha(v_1), \ldots, \alpha^{k_1-1}(v_1)\}$. It is invariant under $\alpha$ and of dimension $k_1$. By
Proposition 13.3, $V = U_1 \oplus W_1$, where $W_1$ is a subspace of $V$ invariant under $\alpha$. The
restriction of $\alpha$ to $W_1$ is a nilpotent endomorphism of $W_1$ with index of nilpotence
$k_2 \le k_1$. We now repeat the above procedure for $W_1$. Pick $v_2 \in W_1 \setminus \ker(\alpha^{k_2-1})$.
Then the subspace $U_2 = F[\alpha]v_2$ of $W_1$ has a basis $\{v_2, \alpha(v_2), \ldots, \alpha^{k_2-1}(v_2)\}$. It is invariant
under $\alpha$ and of dimension $k_2$. Moreover, we can write $W_1 = U_2 \oplus W_2$, where $W_2$
is invariant under $\alpha$. Continuing in this manner, we end up with a decomposition
$V = U_1 \oplus \cdots \oplus U_t$, where each $U_i$ is a subspace of $V$ invariant under $\alpha$ having a
basis of the form $\{v_i, \alpha(v_i), \ldots, \alpha^{k_i-1}(v_i)\}$ as above. This proves the first contention
of the proposition. The second one follows since $U_i = F[\alpha]v_i$ for all $i$, which leads
to a matrix of the desired form.


A matrix of the form given in Proposition 13.4 is called a representation of the
nilpotent endomorphism $\alpha$ in Jordan canonical form. Let $V$ be a vector space over
a field $F$ and let $\alpha$ be an endomorphism of $V$ having an eigenvalue $c$. A vector
$0_V \neq v \in V$ is a generalized eigenvector of $\alpha$ associated with $c$ of degree $k > 0$
if and only if $v$ is in $\ker((\alpha - c\sigma_1)^k) \setminus \ker((\alpha - c\sigma_1)^{k-1})$. Thus, in particular, the
eigenvectors of $\alpha$ associated with $c$, in the previous sense, are just the generalized
eigenvectors of $\alpha$ of degree 1 associated with $c$.


The nineteenth-century French mathematician Camille Jordan made 
major contributions to linear algebra, group theory, the theory of finite 
fields, and the beginnings of topology. 


Example Let a be the endomorphism of R* represented with respect to the canon- 


2-2 11 
ical basis by the matrix 0 : ; . This endomorphism has an eigenvector 
0 0 0 2 
2 1 
' associated with the eigenvalue | and an eigenvector ; associated with the 
0 0 
0 
eigenvalue 2. It also has a generalized eigenvector = of degree 2 associated 
0 
0 
with the eigenvalue 2 and a generalized eigenvector E of degree 3 associated 
-1 


with the eigenvalue 2. 


We now prove a generalization of Proposition 12.1. 


Proposition 13.5 Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$
have an eigenvalue $c$. Then the set of all generalized eigenvectors of $\alpha$ (of all
degrees) associated with $c$, together with $0_V$, forms a subspace of $V$ which is
invariant under any endomorphism of $V$ which commutes with $\alpha$.


Proof Let $a \in F$ and let $v, w \in V$ be generalized eigenvectors of $\alpha$ associated
with $c$, of degrees $k$ and $h$, respectively. Then both $v$ and $w$ belong to
$\ker((\alpha - c\sigma_1)^{h+k})$ and hence the same is true for $v + w$ and $av$. This means that,
whenever these vectors are nonzero, there exist
positive integers $s, t \le h + k$ such that $v + w \in \ker((\alpha - c\sigma_1)^s) \setminus \ker((\alpha - c\sigma_1)^{s-1})$
and $av \in \ker((\alpha - c\sigma_1)^t) \setminus \ker((\alpha - c\sigma_1)^{t-1})$, proving that we have a subspace.
If $\beta$ is an endomorphism of $V$ which commutes with $\alpha$ and if $v$ is a generalized
eigenvector of $\alpha$ associated with $c$ such that $v \in \ker((\alpha - c\sigma_1)^k)$, then
$(\alpha - c\sigma_1)^k\beta(v) = \beta(\alpha - c\sigma_1)^k(v) = 0_V$ so $\beta(v)$ is either $0_V$ or also a generalized eigenvector
of $\alpha$ associated with $c$, proving invariance.


Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$ have an eigenvalue $c$.
The subspace of $V$ defined in Proposition 13.5 is called the generalized eigenspace
of $\alpha$ associated with $c$.


Proposition 13.6 Let $V$ be a vector space over a field $F$ and let $\alpha \in \operatorname{End}(V)$
have an eigenvalue $c$. Let $v$ be a generalized eigenvector of degree $k$
associated with $c$. Then the set of vectors $\{v, (\alpha - c\sigma_1)(v), \ldots, (\alpha - c\sigma_1)^{k-1}(v)\}$ is
linearly independent.


Proof Set $\beta = \alpha - c\sigma_1$ and, for each $1 \le j \le k$, let $v_j = \beta^{k-j}(v)$. Assume
that there exist scalars $c_1, \ldots, c_k \in F$ satisfying $\sum_{j=1}^{k} c_j v_j = 0_V$. Then $0_V =
\beta^{k-1}\left(\sum_{j=1}^{k} c_j v_j\right) = \beta^{k-1}(c_k v_k) = c_k\beta^{k-1}(v_k)$ and so, since $\beta^{k-1}(v_k) \neq 0_V$, we
conclude that $c_k = 0$. We work backwards in this manner to see that $c_j = 0$ for all
$1 \le j \le k$, and so the given set is linearly independent.


In particular, let $V$ be a vector space of finite dimension $n$ over a field $F$ and
let $\alpha \in \operatorname{End}(V)$. If $v$ is a generalized eigenvector of $\alpha$ of degree $k$ associated to an
eigenvalue $c$ of $\alpha$, then we must have $k \le n$. Thus we see that $\dim(V)$ is an upper
bound to the degree of a generalized eigenvector of $\alpha$ and we see that the generalized
eigenspace of $\alpha$ associated to an eigenvalue $c$ is just $\ker((\alpha - c\sigma_1)^n)$.
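
The description of the generalized eigenspace as $\ker((\alpha - c\sigma_1)^n)$ translates directly into a computation. The sketch below, over the rationals and with illustrative function names, returns the dimension of the generalized eigenspace of a square matrix `A` for a given scalar `c` as $n - \operatorname{rank}((A - cI)^n)$.

```python
from fractions import Fraction


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def rank(rows):
    """Rank by Gaussian elimination over Q."""
    M = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(rk, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for i in range(rk + 1, len(M)):
            f = M[i][col] / M[rk][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[rk])]
        rk += 1
    return rk


def generalized_eigenspace_dim(A, c):
    """Dimension of ker((A - cI)^n) for an n x n matrix A."""
    n = len(A)
    B = [[Fraction(A[i][j]) - (c if i == j else 0) for j in range(n)]
         for i in range(n)]
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(n):
        power = mat_mul(power, B)
    return n - rank(power)


A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]
print(generalized_eigenspace_dim(A, 2), generalized_eigenspace_dim(A, 3))
```

For the sample matrix the eigenvalue 2 has a one-dimensional eigenspace but a two-dimensional generalized eigenspace; if `c` is not an eigenvalue the function returns 0.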


Proposition 13.7 Let $V$ be a vector space of finite dimension $n$ over a field
$F$ and let $\alpha \in \operatorname{End}(V)$ satisfy the condition that the characteristic
polynomial $p(X)$ of $\alpha$ is completely reducible, say $p(X) = \prod_{j=1}^{m} (X - c_j)^{n_j}$, where
$\operatorname{spec}(\alpha) = \{c_1, \ldots, c_m\}$. Then there exist subspaces $U_1, \ldots, U_m$ of $V$, each of
which is invariant under $\alpha$, such that:

(1) $V = U_1 \oplus \cdots \oplus U_m$;

(2) $\dim(U_h) = n_h$ for each $1 \le h \le m$;

(3) For each $1 \le h \le m$, the restriction of $\alpha$ to $U_h$ is of the form $c_h\tau_h + \beta_h$,
where $\beta_h \in \operatorname{End}(U_h)$ is nilpotent and $\tau_h$ is the restriction of $\sigma_1$ to $U_h$.


Proof For each $1 \le h \le m$, consider the endomorphism $\beta_h = \alpha - c_h\sigma_1$ of $V$, and
let $U_h$ be the generalized eigenspace of $\alpha$ associated with $c_h$. Then $U_h$ is a subspace
of $V$ invariant under $\beta_h$, and also invariant under $\alpha$ since for all $v \in U_h$ we have
$\beta_h^k\alpha(v) = \alpha\beta_h^k(v) = \alpha(0_V) = 0_V$, where $k$ is any positive integer satisfying
$\beta_h^k(v) = 0_V$. We claim that there exists a positive integer $k$,
independent of $h$, such that all elements of $U_h$ are generalized eigenvectors of $\alpha$ of
degree at most $k$. Indeed, we see that $\ker(\beta_h) \subseteq \ker(\beta_h^2) \subseteq \ker(\beta_h^3) \subseteq \cdots$ and since
$V$ is finitely-generated, there are at most a finite number of proper containments.
Thus there exists a $k$ such that $\ker(\beta_h^k) = \ker(\beta_h^{k+1}) = \cdots$. From here it is clear that
$\ker(\beta_h^k) = U_h$, proving the claim.

In particular, this claim shows that the restriction of $\beta_h$ to $U_h$ is a nilpotent
endomorphism having index of nilpotence at most $k$. More than that, the restriction of $\alpha$ to $U_h$
equals $c_h\tau_h + \beta_h$, proving (3). We now notice that if $t \ne h$ then $U_t$ is invariant under
$\beta_h$. We claim that the restriction of $\beta_h$ to $U_t$ is an automorphism. Since $U_t$ is
finite-dimensional, it is sufficient to prove that it is a monomorphism. Indeed, suppose that
$v \in U_t \cap \ker(\beta_h)$. Then there exists a positive integer $k$ such that $\beta_t^k(v) = 0_V$ and so
$0_V = \beta_t^k(v) = [\beta_h + (c_h - c_t)\sigma_1]^k(v) = (c_h - c_t)^kv$ and, since $c_h - c_t \ne 0$, we must
have $v = 0_V$, proving the claim.

The next step is to show that the collection $\{U_1, \ldots, U_m\}$ of subspaces of $V$ is
independent. Indeed, let $1 \le h \le m$ and let $Y = U_h \cap \sum_{t \ne h} U_t$. Then $Y$ is a subspace
of $V$ invariant under $\beta_h$, on which $\beta_h$ is monic (since $Y \subseteq \sum_{t \ne h} U_t$) and
nilpotent (since $Y \subseteq U_h$), which is possible only if $Y = \{0_V\}$. This proves
independence, and we will set $U = U_1 \oplus \cdots \oplus U_m$. We want to show that $U = V$.
Let $v \in V$. By the Cayley-Hamilton Theorem (Proposition 12.16), we see that
$\alpha$ annihilates its characteristic polynomial $p(X)$ and so $\bigl[\prod_{h=1}^m \beta_h^{n_h}\bigr](v) = 0_V \in U$.
Suppose that $\beta_1^{n_1}(v) \in U$, say that it is equal to $\sum_{h=1}^m u_h$, where $u_h \in U_h$ for all
$1 \le h \le m$. Since $\beta_1^{n_1}$ is epic when restricted to $U_h$ for each $h \ne 1$, we can
find an element $w_h$ of $U_h$, for each $1 < h \le m$, such that $u_h = \beta_1^{n_1}(w_h)$ for all
such $h$. Therefore, $\beta_1^{n_1}\bigl(v - \sum_{h=2}^m w_h\bigr) = u_1 \in U_1$. By definition of $U_1$, it follows
that $w_1 = v - \sum_{h=2}^m w_h \in U_1$. Therefore, $v = \sum_{h=1}^m w_h \in U$. If, on the other hand,
$\beta_1^{n_1}(v) \notin U$, then let $t$ be the smallest element of $\{2, \ldots, m\}$ satisfying the condition
that $\bigl[\prod_{h=1}^t \beta_h^{n_h}\bigr](v) \in U$ and $\bigl[\prod_{h=1}^{t-1} \beta_h^{n_h}\bigr](v) \notin U$. A similar argument to the
preceding then shows that we must have $v \in U$.

We are left to show that $\dim(U_h) = n_h$ for all $1 \le h \le m$. Pick a basis for $V$
which is a union of bases of the $U_h$. With respect to this basis, the endomorphism
$\alpha$ is represented by a matrix of the form

$$A = \begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_m \end{bmatrix},$$

where each $A_h$ is a matrix representing the restriction of $\alpha$ to $U_h$. By Proposition 11.12, the
characteristic polynomial of $\alpha$ is therefore of the form $|XI - A| = \prod_{h=1}^m |XI - A_h|$.
From this decomposition and from the fact that each $\beta_h$ restricts to an
automorphism of $U_t$ for all $t \ne h$, it follows that the only eigenvalue of the restriction of
$\alpha$ to $U_h$ is $c_h$, and the algebraic multiplicity of this eigenvalue is at most $n_h$. Since
$\sum_{h=1}^m \dim(U_h) = n = \sum_{h=1}^m n_h$, it then follows that $\dim(U_h) = n_h$ for each $h$.


Proposition 13.7 shows that when conditions are right (for example, when the
field $F$ is algebraically closed) and when we are given an endomorphism $\alpha$ of a
finite-dimensional vector space $V$, it is possible to choose a basis for $V$ relative to
which $\alpha$ is represented in a particularly simple form. We do this in two steps.

(I) Write $V$ as a direct sum $U_1 \oplus \cdots \oplus U_m$ as above. By choosing a basis for $V$
which is a union of bases of the $U_h$, we get a matrix representing $\alpha$ composed
of blocks strung out along the main diagonal, each representing the restriction
of $\alpha$ to one of the subspaces $U_h$.

(II) For each $h$, we have $\alpha = c_h\sigma_1 + \beta_h$ on $U_h$, where $\beta_h$ is a nilpotent endomorphism
of $U_h$. We now choose a basis of $U_h$ relative to which $\beta_h$ is represented in
Jordan canonical form.


Thus, in the end, we have a representation of $\alpha$ by a matrix of the form

$$\begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_m \end{bmatrix},$$

where each block $A_h$ is a matrix with blocks of the form

$$\begin{bmatrix} c_h & 0 & 0 & \cdots & 0 \\ 1 & c_h & 0 & \cdots & 0 \\ 0 & 1 & c_h & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & c_h \end{bmatrix}$$

on its diagonal (these may be $1 \times 1$!) and all other entries equal to 0.
A matrix of this form is called the Jordan canonical form of $\alpha$. By
Proposition 13.7, we see that if $V$ is a vector space finitely generated over a field
$F$ and if $\alpha$ is an endomorphism of $V$ having a completely reducible characteristic
polynomial in $F[X]$, then there is a basis of $V$ relative to which $\alpha$ can be represented
by a matrix in Jordan canonical form. Thus, this can always be done if the field $F$
is algebraically closed. If $F$ is not algebraically closed then it is always possible to
extend the field $F$ to a larger field $K$ such that the characteristic polynomial of $\alpha$ is
completely reducible in $K[X]$.


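Example Consider $\alpha \in \mathrm{End}(\mathbb{R}^4)$ represented with respect to the canonical basis by

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -1 & 4 & -6 & 4 \end{bmatrix}.$$

The characteristic polynomial of $A$ is $X^4 - 4X^3 + 6X^2 - 4X + 1 = (X - 1)^4$ and so its
only eigenvalue is 1. Then $A$ is similar to

$$B = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}$$

in Jordan canonical form. Indeed, $B = PAP^{-1}$, where

$$P = \begin{bmatrix} -1 & 3 & -3 & 1 \\ 1 & -2 & 1 & 0 \\ -1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}.$$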
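
The similarity in the preceding example can be checked mechanically. The sketch below assumes the entries of $P$ to be $(-1, 3, -3, 1)$, $(1, -2, 1, 0)$, $(-1, 1, 0, 0)$, $(1, 0, 0, 0)$, row by row; this is one matrix for which the relation holds, and is taken here as an assumption about the printed one. Since this $P$ is nonsingular (its antidiagonal entries are all 1 and everything below the antidiagonal is 0), verifying $BP = PA$ is equivalent to verifying $B = PAP^{-1}$. The helper name `mat_mul` is illustrative.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Matrix of the example (companion matrix of (X - 1)^4).
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 4, -6, 4]]

# Its Jordan canonical form: a single 4 x 4 block for the eigenvalue 1.
B = [[1, 0, 0, 0],
     [1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]

# Assumed change-of-basis matrix (see the lead-in).
P = [[-1, 3, -3, 1],
     [1, -2, 1, 0],
     [-1, 1, 0, 0],
     [1, 0, 0, 0]]

lhs = mat_mul(B, P)   # B P
rhs = mat_mul(P, A)   # P A
similar = lhs == rhs  # True exactly when B P = P A
```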
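Example Consider $\alpha \in \mathrm{End}(\mathbb{R}^5)$ represented with respect to the canonical basis by

$$A = \begin{bmatrix} 3 & 0 & 0 & 0 & 0 \\ 0 & 4 & 2 & -1 & 4 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 1 & 3 & 2 & 1 \\ 0 & 0 & 0 & 0 & 2 \end{bmatrix}.$$

The characteristic polynomial of $A$ is $(X - 3)^3(X - 2)^2$ and $A$ is similar to

$$B = \begin{bmatrix} 2 & 0 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 \\ 0 & 1 & 3 & 0 & 0 \\ 0 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}$$

in Jordan canonical form. Indeed, $B = PAP^{-1}$, where

$$P = \begin{bmatrix} 0 & 0 & -3 & 0 & -4 \\ 0 & 1 & -1 & -1 & 3 \\ 2 & 1 & 3 & 0 & 1 \\ 0 & 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 & 0 \end{bmatrix}.$$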


Example Consider a € End(C*) which is represented with respect to the canon- 


0 0 2 O 
ical basis by A = : ne . The characteristic polynomial of A 
—2i 0 0 O 
0 -2i 1 O 
—2 0 0 0 
is (X —2)?(X + 2)? and A is similar to B = Be OY gesd 
0 0 2 0 : 
0 oO 1 2 
10-i O 
B= PAP-', where P = 5 ; : : 4 
01 0 3 


We now use Jordan canonical forms to prove a result interesting in its own right. 


Proposition 13.8 Let $n$ be a positive integer and let $A \in M_{n \times n}(F)$, where $F$
is an algebraically-closed field. Then $A$ can be written as a product of two
symmetric matrices.


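Proof By Proposition 13.7, we know that $A$ is similar to a matrix $B$ in Jordan
canonical form. In other words, there exists a nonsingular matrix $Q \in M_{n \times n}(F)$
satisfying $A = QBQ^{-1}$. If we can write $B = CD$, where both $C$ and $D$ are symmetric,
then $A = QCDQ^{-1} = (QCQ^T)((Q^T)^{-1}DQ^{-1}) = (QCQ^T)((Q^{-1})^TDQ^{-1})$,
where both $QCQ^T$ and $(Q^{-1})^TDQ^{-1}$ are symmetric. Therefore, without loss of
generality, we can assume that $A$ is in Jordan canonical form, say

$$A = \begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_m \end{bmatrix},$$

where each block $A_h$ is of the form

$$\begin{bmatrix} a_h & 0 & 0 & \cdots & 0 \\ 1 & a_h & 0 & \cdots & 0 \\ 0 & 1 & a_h & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & a_h \end{bmatrix} \in M_{n_h \times n_h}(F).$$

Define the matrix $D_h \in M_{n_h \times n_h}(F)$ to be $[d_{ij}]$, where

$$d_{ij} = \begin{cases} 1 & \text{if } i + j = n_h + 1, \\ 0 & \text{otherwise.} \end{cases}$$

Then $D_h$ is a symmetric matrix satisfying $D_h^{-1} = D_h$. Moreover, the matrix

$$D = \begin{bmatrix} D_1 & O & \cdots & O \\ O & D_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & D_m \end{bmatrix} \in M_{n \times n}(F)$$

is also symmetric and satisfies $D^{-1} = D$. Furthermore, the matrix

$$C = \begin{bmatrix} A_1D_1 & O & \cdots & O \\ O & A_2D_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_mD_m \end{bmatrix} \in M_{n \times n}(F)$$

is also symmetric and $A = CD$, as required.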
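
The construction in this proof is easy to test on a small case. The following is a sketch only: the Jordan matrix built from one $3 \times 3$ block with diagonal entry 2 and one $2 \times 2$ block with diagonal entry $-1$ is an arbitrary illustration, and the helper names (`jordan_block`, `reversal`, `block_diag`) are hypothetical. It builds $D$ from the reversal matrices $D_h$ and $C$ from the products $A_hD_h$, so that symmetry of $C$ and $D$ and the identity $A = CD$ can be compared entry by entry.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def jordan_block(a, size):
    """Jordan block with a on the diagonal and 1 on the subdiagonal."""
    return [[a if i == j else 1 if i == j + 1 else 0 for j in range(size)]
            for i in range(size)]

def reversal(size):
    """The matrix D_h: entry 1 exactly when i + j = size + 1 (1-based indices)."""
    return [[1 if i + j == size - 1 else 0 for j in range(size)] for i in range(size)]

def block_diag(blocks):
    """Block-diagonal matrix assembled from square blocks."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                M[offset + i][offset + j] = x
        offset += len(b)
    return M

# Illustrative Jordan matrix (an assumption, not taken from the text).
blocks = [jordan_block(2, 3), jordan_block(-1, 2)]
A = block_diag(blocks)
D = block_diag([reversal(len(b)) for b in blocks])
C = block_diag([mat_mul(b, reversal(len(b))) for b in blocks])

identity = [[int(i == j) for j in range(len(A))] for i in range(len(A))]
checks = {
    "C symmetric": C == transpose(C),
    "D symmetric": D == transpose(D),
    "D is its own inverse": mat_mul(D, D) == identity,
    "A = CD": mat_mul(C, D) == A,
}
```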


Another interesting result is the following. 


Proposition 13.9 If $A \in M_{n \times n}(F)$, where $F$ is an algebraically-closed field,
then $A$ is similar to its transpose.


Proof Since $F$ is algebraically closed, we know that $A$ is similar to a matrix $B$ in
Jordan canonical form, and to show that $B$ is similar to its transpose it suffices to
show that each Jordan block is similar to its transpose. This is surely true of blocks
of size $1 \times 1$. If

$$C = \begin{bmatrix} c & 0 & 0 & \cdots & 0 \\ 1 & c & 0 & \cdots & 0 \\ 0 & 1 & c & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & c \end{bmatrix}$$

is a Jordan block of size $h \times h$ for $h > 1$, then we note that $PCP = C^T$, where $P = [p_{ij}]$
is the involutory matrix defined by the condition that

$$p_{ij} = \begin{cases} 1 & \text{if } i + j = h + 1, \\ 0 & \text{otherwise,} \end{cases}$$

thus proving the result.
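
As a worked instance of the identity $PCP = C^T$, take $h = 3$ (the size is chosen here purely for illustration). Multiplying on the left by $P$ reverses the order of the rows, and multiplying on the right by $P$ reverses the order of the columns:

```latex
P = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix},
\qquad
C = \begin{bmatrix} c & 0 & 0 \\ 1 & c & 0 \\ 0 & 1 & c \end{bmatrix},
\qquad
PC = \begin{bmatrix} 0 & 1 & c \\ 1 & c & 0 \\ c & 0 & 0 \end{bmatrix},
\qquad
(PC)P = \begin{bmatrix} c & 1 & 0 \\ 0 & c & 1 \\ 0 & 0 & c \end{bmatrix} = C^{T}.
```

Since $P^2 = I$, this says precisely that $C^T = PCP^{-1}$.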


Contemporary American mathematician Richard Brualdi and Chinese mathematicians
Pei Pei and Xingzhi Zhan have shown that the Jordan canonical form
of a matrix in $M_{n \times n}(\mathbb{C})$ is the best one can get in terms of sparseness: namely, they
proved that among all the matrices that are similar to a given matrix in $M_{n \times n}(\mathbb{C})$,
the Jordan canonical form has the greatest number of off-diagonal zero entries.


Exercises 


Exercise 847
Find endomorphisms $\alpha$ and $\beta$ of $\mathbb{R}^3$ satisfying the condition that $\alpha\beta$ is not
nilpotent but $c\alpha + d\beta$ is nilpotent for all $c, d \in \mathbb{R}$.


Exercise 848
Let $V$ be a vector space over a field $F$ and let $\alpha \in \mathrm{End}(V)$ be nilpotent, having
index of nilpotence $k > 0$. Show that $\sigma_1 + \alpha \in \mathrm{Aut}(V)$.


Exercise 849
Let $V$ be a vector space finitely-generated over $\mathbb{C}$. Do there exist endomorphisms
$\alpha$ and $\beta$ of $V$ satisfying the condition that $\sigma_1 + \alpha\beta - \beta\alpha$ is nilpotent?


Exercise 850
Let $F$ be a field and let $B$ be a given basis of $F^3$. Let $a \in F$ and let $\alpha$ be the
endomorphism of $F^3$ satisfying

$$\Phi_{BB}(\alpha) = \begin{bmatrix} -a & a & a \\ 0 & 0 & 0 \\ -a & a & a \end{bmatrix}.$$

For which values of $a$ is this endomorphism nilpotent?


Exercise 851 
Let F be a field. Give an example of a nilpotent endomorphism of F> having 
index of nilpotence 3. 
Exercise 852
Let $\alpha$ be the endomorphism of $V = \mathbb{R}^4$ represented with respect to a basis $B$ of
$V$ by the matrix

$$\begin{bmatrix} 2 & -8 & 12 & -60 \\ 2 & -5 & 9 & -48 \\ 6 & -17 & 29 & -152 \\ 1 & -3 & 5 & -26 \end{bmatrix}.$$

Show that $\alpha$ is nilpotent and find its index of nilpotence.


Exercise 853
Let $V$ be a vector space over a field $F$ and let $\alpha \in \mathrm{End}(V)$ be nilpotent. Does $\beta\alpha$
have to be nilpotent for all $\beta \in \mathrm{End}(V)$?


Exercise 854
Let $V$ be a vector space over a field $F$ and let $\alpha \in \mathrm{End}(V)$ be nilpotent. Find
$\mathrm{spec}(\alpha)$.


Exercise 855
Let $V$ be a vector space finitely generated over $\mathbb{C}$ and let $\alpha \in \mathrm{End}(V)$ satisfy
$\mathrm{spec}(\alpha) = \{0\}$. Show that $\alpha$ is nilpotent.


Exercise 856
Let $V$ be a vector space over a field $F$ and let $\alpha \in \mathrm{End}(V)$ be nilpotent, having
index of nilpotence $k$. Find the minimal polynomial of $\alpha$.


Exercise 857
Let $V$ be a vector space finitely generated over a field $F$ and let $\alpha \in \mathrm{End}(V)$
satisfy the condition that for each $v \in V$ there exists a positive integer $n(v)$
satisfying $\alpha^{n(v)}(v) = 0_V$. Show that $\alpha$ is nilpotent. Does the same result hold if $V$
is not assumed to be finitely generated over $F$?


Exercise 858
Let $F$ be a field and let $\alpha$ be an endomorphism of $F^3$ represented with respect to
a basis $B$ of $F^3$ by the matrix

$$\begin{bmatrix} 0 & a & 0 \\ 0 & 0 & b \\ 0 & 0 & 0 \end{bmatrix},$$

where $a$ and $b$ are nonzero scalars. Does there exist an endomorphism $\beta$ of $F^3$ satisfying $\beta^2 = \alpha$?


Exercise 859
Let $\alpha$ be a nilpotent endomorphism of a vector space $V$ over a field $F$ having
characteristic 0. Show that there exists an endomorphism $\beta$ of $V$ belonging to
$F[\alpha]$ and satisfying $\beta^2 = \sigma_1 + \alpha$.
Exercise 860
Let $\alpha \in \mathrm{End}(\mathbb{R}^3)$ be represented with respect to the canonical basis by the matrix

$$\begin{bmatrix} 1 & 2 & -2 \\ 3 & 0 & 3 \\ 1 & 1 & -2 \end{bmatrix}.$$

Calculate $\mathbb{R}[\alpha]\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$.


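Exercise 861
Let $V$ be the space of all infinitely-differentiable functions from $\mathbb{R}$ to itself. Let
$\delta$ be the endomorphism of $V$ which assigns to each function its derivative. What
is $\mathbb{R}[\delta]\sin(x)$?


Exercise 862
Define $\alpha \in \mathrm{End}(\mathbb{R}^3)$ by $\alpha: \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} a + c \\ b - a \\ b \end{bmatrix}$. Find $\mathbb{R}[\alpha]\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.
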
Exercise 863 


Let $V$ be a vector space over a field $F$ and let $\alpha \in \mathrm{End}(V)$. Let $v \in V$ be a vector
satisfying $F[\alpha^2]v = V$. Show that $F[\alpha]v = V$.


Exercise 864 


Given a € R, let aq € End(R*) be represented with respect to the canonical basis 


0 a l 0 
: 1 —2 1 1 : : i : 
by the matrix 0 0 4 0}: For which values of a is the dimension of 
0) 1 0 -2 
0 
) 
R[ag] 0 equal to 3? 
—1 


Exercise 865
Let $\alpha \in \mathrm{End}(\mathbb{R}^3)$ be represented with respect to the canonical basis by the matrix

$$\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$

Find the eigenvalues of $\alpha$ and the generalized eigenspace associated with each.


Exercise 866
Let $V$ be a vector space finitely generated over $\mathbb{C}$ and let $\alpha \in \mathrm{End}(V)$. Show
that $\alpha$ is diagonalizable if and only if every generalized eigenvector of $\alpha$ is an
eigenvector of $\alpha$.


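Exercise 867
Let $B = \{v_1, v_2, v_3\}$ be the canonical basis of $\mathbb{R}^3$ and let $\alpha$ be the
endomorphism of $\mathbb{R}^3$ satisfying

$$\Phi_{BB}(\alpha) = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Show that $W = \mathbb{R}\{v_1, v_3\}$ and $Y = \mathbb{R}v_2$ are complements of each other in $\mathbb{R}^3$
and that each of these spaces is invariant under $\alpha$.


Exercise 868
Let $A = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 2 & 0 \\ 2 & -2 & -1 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$. Find the Jordan canonical form of $A$.


Exercise 869
Let $A = \begin{bmatrix} -2 & 8 & 6 \\ -4 & 10 & 6 \\ 4 & -8 & -4 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$. Find the Jordan canonical form of $A$.


Exercise 870
Let $O \ne A \in M_{3 \times 3}(\mathbb{C})$ be of the form $\begin{bmatrix} 0 & a & -b \\ -a & 0 & c \\ b & -c & 0 \end{bmatrix}$, where $a$, $b$, and $c$ are
real numbers. What is the Jordan canonical form of $A$?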


Exercise 871 
Let A € M5 x5(Q) be a matrix in Jordan canonical form having minimal poly- 
nomial (X — 3). What does A look like? 


Exercise 872 
Give an example of a matrix in M4 ,4(R) which is not similar to a matrix in 
Jordan canonical form. 


Exercise 873 

Let $V$ be a vector space finitely generated over a field $F$ and let $\alpha$ and $\beta$ be
nilpotent endomorphisms of $V$ represented with respect to some given basis by
matrices $A$ and $B$, respectively. If the matrices $A$ and $B$ are similar, does the
index of nilpotence of $\alpha$ have to equal that of $\beta$?
Exercise 874
Let $F$ be a field and let $A = \begin{bmatrix} a & 0 & 0 \\ 1 & a & 0 \\ 0 & 1 & a \end{bmatrix} \in M_{3 \times 3}(F)$. Show that

$$A^k = \begin{bmatrix} a^k & 0 & 0 \\ ka^{k-1} & a^k & 0 \\ \frac{1}{2}k(k-1)a^{k-2} & ka^{k-1} & a^k \end{bmatrix}$$

for all $k > 0$.


Exercise 875
Let $n$ be a positive integer and, for all $1 \le i, j \le n$, let $p_{ij}(X) \in \mathbb{C}[X]$. Let
$\varphi: \mathbb{C} \to M_{n \times n}(\mathbb{C})$ be the function defined by $\varphi: z \mapsto [p_{ij}(z)]$. Furthermore,
let us assume that $\varphi(z)$ is nonsingular for each $z \in \mathbb{C}$. Show that there exists
a nonzero complex number $d$ such that $|\varphi(z)| = d$ for all $z \in \mathbb{C}$.


Exercise 876
For each $t \in \mathbb{C}$, let $\alpha_t$ be the endomorphism of $\mathbb{C}^3$ represented with respect to the
canonical basis by the matrix

$$\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ t & 0 & 0 \end{bmatrix}.$$

Is the representation of $\alpha_t$ in Jordan canonical form dependent on $t$?
Exercise 877 
0 0 4 
LettA=|0 O O] €M3x3(C). Find the set of all c € C satisfying the condi- 
0 0 0 
tion that cA is similar to A. 
Exercise 878 
1 -1 1 0 
Find the Jordan canonical form of A = 5 1 1 0 M3 3(R). 
0 0 0 


Exercise 879 
Let a € R be positive. Find the Jordan canonical form of 


a 0 0 1 
0 1 1 +0 
011 0\¢ M4x4(R). 
00 0 1 


Exercise 880
Let $A \in M_{n \times n}(\mathbb{R})$ differ from $I$ and $O$. If $A$ is idempotent, show that its Jordan
canonical form is a diagonal matrix.


Exercise 881
Let $I \ne A \in M_{n \times n}(\mathbb{R})$ be an involutory matrix. Show that the Jordan canonical
form of $A$ is a diagonal matrix.


14 The Dual Space


Let $V$ be a vector space over a field $F$. A linear transformation from $V$ to $F$
(considered as a vector space over itself) is a linear functional on $V$. The space $\mathrm{Hom}(V, F)$
of all such linear functionals is called the dual space of $V$ and will be denoted by
$D(V)$. Note that $D(V)$ is a vector space over $F$, the identity element of which for
addition is the 0-functional, $v \mapsto 0$. Since $\dim(F) = 1$, we immediately see that
every linear functional other than the 0-functional must be an epimorphism.



Linear functionals were first studied systematically 
by the French mathematician Jacques Hadamard, 
whose long life ranged from the mid nineteenth 
century to the mid twentieth century, and by his 
student Maurice Fréchet. Their work on function- 
als turned them into a major tool in analysis. 


Example Let $F$ be a field and let $n$ be a positive integer. Any $v \in F^n$ defines a linear
functional in $D(F^n)$ by $w \mapsto v \odot w$.


Example Let $V$ be a vector space over a field $F$ and let $B$ be a basis for $V$. Each
$u \in B$ defines a function $f_u \in F^B$ defined by

$$f_u: u' \mapsto \begin{cases} 1 & \text{if } u' = u, \\ 0 & \text{otherwise,} \end{cases}$$

and by Proposition 6.2 we know that this function in turn defines a linear functional
$\delta_u \in D(V)$. In particular, if $V = F^n$ and if $B = \{u_1, \ldots, u_n\}$ is the canonical basis
for $V$, then $\delta_{u_h}: \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \mapsto a_h$ for each $1 \le h \le n$.

an 
Example Suppose that $V = C(a, b)$ and that $g_0 \in V$. Then the function $\eta: V \to \mathbb{R}$
defined by $\eta: f \mapsto \int_a^b f(x)g_0(x)\,dx$ belongs to $D(V)$. Hadamard's initial work on
linear functionals concerned those of the form $f \mapsto \lim_{n \to \infty} \int_a^b f(x)g_n(x)\,dx$ for
suitable sequences $g_1, g_2, \ldots$ in $V$.


Example Let $V$ be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of all infinitely-differentiable
functions $f$ satisfying the condition that there exist real numbers $a < b$ such that
$f(x) = 0$ if $x \notin [a, b]$. Then the function $f \mapsto \int_{-\infty}^{\infty} f(x)\,dx$ belongs to $D(V)$.
Elements of $D(V)$ are known as distributions and play an important role in analysis
and theoretical physics.


Note that the linear functional $\mathrm{tr}: M_{n \times n}(F) \to F$ is not a homomorphism of
$F$-algebras whenever $n > 1$. If $(K, \bullet)$ is an algebra over a field $F$ then a nonzero
linear functional $\delta \in D(K)$ which is also a homomorphism of $F$-algebras is called
a weight function on $K$ and an algebra having a weight function is called a baric
algebra.¹ Nonassociative baric algebras are an important context for mathematical
models in genetics.

Let $F$ be a field and let $n$ be a positive integer. Then there exists a linear
functional $\mathrm{tr}: M_{n \times n}(F) \to F$ which assigns to each matrix the sum of the elements
of its diagonal, i.e., $\mathrm{tr}: [a_{ij}] \mapsto \sum_{i=1}^n a_{ii}$. This linear functional is called the
trace. This functional will play an important part in our later discussion. Note that
$v \odot w = \mathrm{tr}(v \wedge w)$ for all $v, w \in F^n$.

If $A = [a_{ij}]$ and $B = [b_{ij}]$ are matrices in $M_{n \times n}(F)$, then it is easy to see that
$\mathrm{tr}(AB) = \sum_{i=1}^n \sum_{h=1}^n a_{ih}b_{hi} = \mathrm{tr}(BA)$. We also notice that $\mathrm{tr}(I) = n$, where $I$ is
the identity matrix of $M_{n \times n}(F)$. If the characteristic of the field $F$ does not divide $n$,
we claim that these conditions uniquely characterize the trace.
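
The two identities just stated, $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ and $v \odot w = \mathrm{tr}(v \wedge w)$, can be spot-checked on small integer examples. The sketch below is illustrative only: the matrices and vectors are arbitrary, the helper names are hypothetical, and $v \wedge w$ is taken to be the matrix with $(i, j)$-entry $v_iw_j$, which is the reading under which the stated identity holds.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def outer(v, w):
    """Matrix with (i, j)-entry v_i * w_j (assumed meaning of v ^ w)."""
    return [[x * y for y in w] for x in v]

def dot(v, w):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(v, w))

# Arbitrary sample data (not from the text).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
tr_AB = trace(mat_mul(A, B))
tr_BA = trace(mat_mul(B, A))

v, w = [1, 2, 3], [4, 5, 6]
tr_outer = trace(outer(v, w))
```

Note that $AB$ and $BA$ themselves differ here; only their traces agree.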




The trace of a matrix was first defined by the nineteenth-century 
American mathematician Henry Taber. 


Proposition 14.1 Let $n$ be a positive integer and let $F$ be a field the characteristic
of which does not divide $n$. Let $\delta$ be a linear functional on $M_{n \times n}(F)$
satisfying the conditions that $\delta(AB) = \delta(BA)$ for all $A, B \in M_{n \times n}(F)$ and
that $\delta(I) = n$. Then $\delta = \mathrm{tr}$.


¹ Such structures were first studied by the twentieth-century Scottish mathematician I.M.H. Etherington,
who formulated the Mendelian laws algebraically.
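Proof If $\{H_{ij} \mid 1 \le i, j \le n\}$ is the canonical basis of $M_{n \times n}(F)$, then it suffices to
show that $\delta(H_{ij}) = \mathrm{tr}(H_{ij})$ for all $1 \le i, j \le n$. In particular, if $1 \le i, j \le n$ then
$\delta(H_{ii}) = \delta(H_{ij}H_{ji}) = \delta(H_{ji}H_{ij}) = \delta(H_{jj})$. Since $I = \sum_{i=1}^n H_{ii}$, this implies that
$n = \delta(I) = \sum_{i=1}^n \delta(H_{ii}) = n\delta(H_{11})$ and so, since the characteristic of $F$ does not divide $n$,
we have $\delta(H_{ii}) = 1 = \mathrm{tr}(H_{ii})$ for all $1 \le i \le n$. If $i \ne j$ then
$H_{1j}H_{i1} = O$ and so $\delta(H_{ij}) = \delta(H_{i1}H_{1j}) = \delta(H_{1j}H_{i1}) = \delta(O) = 0 = \mathrm{tr}(H_{ij})$, and
we are done.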


By the above, we see that if $F$ is a field, if $n$ is a positive integer, and if $A, B \in
M_{n \times n}(F)$, then $\mathrm{tr}(A \bullet B) = \mathrm{tr}(AB) - \mathrm{tr}(BA) = 0$, where $\bullet$ denotes the Lie product
on $M_{n \times n}(F)$. In fact, over fields of characteristic 0 the converse is also true. In
order to establish this fact, we first need a technical result.


Proposition 14.2 Let $F$ be a field of characteristic 0 and let $n$ be a positive
integer. Let $A \in M_{n \times n}(F)$ have the property that it is similar to no matrix in
$M_{n \times n}(F)$ having a 0 for its $(1, 1)$-entry. Then $A$ is a scalar matrix.


Proof Clearly, $A$ is not $O$ and so there exists a vector $w \in F^n$ satisfying
$A^Tw \ne \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$. Assume that there exists a vector $v \in F^n$ satisfying the condition
that $w \odot v = 1$ and $(A^Tw) \odot v = 0$. Let $\delta \in D(F^n)$ be given by $\delta: y \mapsto w \odot y$.
Then the nullity of $\delta$ is $n - 1$ and we can pick a basis $\{y_2, \ldots, y_n\}$ for $\ker(\delta)$. Since
$v \notin \ker(\delta)$, we see that the set $\{v, y_2, \ldots, y_n\}$ is linearly independent. Therefore, the
matrix $P$ the columns of which are $v, y_2, \ldots, y_n$ is nonsingular, and $w^T$ is the first
row of $P^{-1}$. Moreover, the $(1, 1)$-entry of the matrix $P^{-1}AP$ is $(A^Tw) \odot v = 0$,
contradicting the assumption on $A$. This means that there is no vector $v$ satisfying
the given conditions and so $A^Tw = c_ww$ for some scalar $c_w \in F$. Thus we conclude
that if $w$ is any vector in $F^n$ then $A^Tw$ is either the 0-vector or a scalar multiple of
$w$ and so, for any nonsingular matrix $Q \in M_{n \times n}(F)$ we see that $Q^{-1}AQ = [b_{ij}]$
is a diagonal matrix. If $b_{hh} \ne b_{kk}$ and if $y$ is the difference between the $h$th and $k$th
rows of $Q^{-1}$, then $A^Ty$ cannot be of the form $c_yy$, which is again a contradiction.
Thus $A$ must in fact be a scalar matrix.


Proposition 14.3 Let $F$ be a field of characteristic 0, let $n$ be a positive integer,
and let $C \in M_{n \times n}(F)$. Then $\mathrm{tr}(C) = 0$ if and only if $C$ is the Lie product
of matrices $A, B \in M_{n \times n}(F)$.


Proof We have already seen that if $C$ is the Lie product of two matrices in
$M_{n \times n}(F)$ then $\mathrm{tr}(C) = 0$. We will prove the converse by induction on $n$. The result
is clearly true if $n = 1$ so we can assume, inductively, that $n > 1$ and that the
result has been established for matrices in $M_{k \times k}(F)$ for any $k < n$. Moreover, if
$C = O$, take $A = B = O$ and we are done. Hence we can assume that $C \ne O$.
Since $F$ has characteristic 0 and $\mathrm{tr}(C) = 0$, we know that $C$ is not a scalar matrix.
By Proposition 14.2, this means that there is a nonsingular matrix $P$ such
that $P^{-1}CP$ has a 0 for its $(1, 1)$-entry. That is to say, we can write $P^{-1}CP$ in
block form as

$$\begin{bmatrix} 0 & x^T \\ z & C' \end{bmatrix},$$

where $x, z \in F^{n-1}$ and $C' \in M_{(n-1) \times (n-1)}(F)$. Moreover,
$\mathrm{tr}(C') = \mathrm{tr}(P^{-1}CP) = \mathrm{tr}(C) = 0$ and so, by the induction hypothesis, there
exist matrices $A', B' \in M_{(n-1) \times (n-1)}(F)$ satisfying $C' = A'B' - B'A'$. If $A'$ is
singular, then we can replace $A'$ by $A' - c'I$ for any scalar $c' \notin \mathrm{spec}(A')$ (and
such an element $c'$ exists since $F$ has characteristic 0 and hence is infinite).
Therefore, without loss of generality, we can assume that $A'$ is nonsingular. Then
$P^{-1}CP = A''B'' - B''A''$, where

$$A'' = \begin{bmatrix} 0 & 0^T \\ 0 & A' \end{bmatrix} \quad \text{and} \quad B'' = \begin{bmatrix} 0 & -x^T(A')^{-1} \\ (A')^{-1}z & B' \end{bmatrix}.$$

Thus $C = (PA''P^{-1})(PB''P^{-1}) - (PB''P^{-1})(PA''P^{-1})$.
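
To see the inductive step at work in the smallest nontrivial case, take $n = 2$ (a worked illustration under the assumption that the conjugated matrix already has the block form with zero $(1,1)$-entry, so that $x$ and $z$ are scalars and $C' = [0]$). Choosing the nonsingular $1 \times 1$ matrix $A' = [1]$ and $B' = [0]$, which satisfy $C' = A'B' - B'A'$, the bordered matrices with blocks $0, 0^T, 0, A'$ and $0, -x^T(A')^{-1}, (A')^{-1}z, B'$ become:

```latex
A'' = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix},
\qquad
B'' = \begin{bmatrix} 0 & -x \\ z & 0 \end{bmatrix},
\qquad
A''B'' - B''A''
  = \begin{bmatrix} 0 & 0 \\ z & 0 \end{bmatrix}
  - \begin{bmatrix} 0 & -x \\ 0 & 0 \end{bmatrix}
  = \begin{bmatrix} 0 & x \\ z & 0 \end{bmatrix}.
```

So every $2 \times 2$ matrix with zero diagonal is a Lie product, as the proposition predicts.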


The first of many proofs of this result was given by Kenjiro Shoda,
one of the major figures in twentieth-century Japanese algebra.
The proof given here is due to Kahan.


Example Note that this result may be false if the field F’ has positive characteristic. 


For example, if F = GF(2) then tr (EF : 


| = 0 but there are no matrices A and 


B in M?2x2(F) satisfying AB — BA= i ‘ 


Thus, if $F$ has characteristic 0 then the set of all matrices $C \in M_{n \times n}(F)$ satisfying
$\mathrm{tr}(C) = 0$ forms a subalgebra of the general Lie algebra $M_{n \times n}(F)^-$, called the
special Lie algebra defined by $F^n$.

If $n$ is a positive integer, $F$ is a field, and $P$ is a nonsingular matrix in $M_{n \times n}(F)$,
then $\mathrm{tr}(PAP^{-1}) = \mathrm{tr}(P^{-1}PA) = \mathrm{tr}(A)$ and so similar matrices have identical
traces. In general, if $B$ and $C$ are fixed matrices in $M_{n \times n}(F)$ then the functions
$A \mapsto \mathrm{tr}(BA)$ and $A \mapsto \mathrm{tr}(AC)$ belong to $D(M_{n \times n}(F))$.

The following result shows that traces essentially define all linear functionals on 
spaces of square matrices. 


Proposition 14.4 Let $F$ be a field, let $n$ be a positive integer, and let
$\delta \in D(M_{n \times n}(F))$. Then there exists a matrix $C \in M_{n \times n}(F)$ satisfying
$\delta: A \mapsto \mathrm{tr}(AC)$ for all $A \in M_{n \times n}(F)$.
Proof For each $1 \le i, j \le n$, let $H_{ij}$ be the matrix having $(i, j)$-entry equal to 1 and
all of the other entries equal to 0. Then we know that the set of all such matrices
is a basis for $M_{n \times n}(F)$. Let $C = [c_{ij}] \in M_{n \times n}(F)$ be the matrix defined by $c_{ij} =
\delta(H_{ji})$ for all $1 \le i, j \le n$. Then for each matrix $A = [a_{ij}] \in M_{n \times n}(F)$ we have
$\delta(A) = \delta\bigl(\sum_{i=1}^n \sum_{j=1}^n a_{ij}H_{ij}\bigr) = \sum_{i=1}^n \sum_{j=1}^n a_{ij}\delta(H_{ij}) = \sum_{i=1}^n \sum_{j=1}^n a_{ij}c_{ji} =
\mathrm{tr}(AC)$.
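
The proof is constructive: the representing matrix is read off from the values of the functional on the matrices $H_{ij}$. A minimal sketch, assuming $n = 2$, integer entries, and a sample functional `delta` invented for illustration (it is not from the text):

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def unit(n, i, j):
    """The matrix H_ij: 1 in position (i, j), 0 elsewhere (0-based indices)."""
    return [[int((r, s) == (i, j)) for s in range(n)] for r in range(n)]

def delta(A):
    """Hypothetical linear functional on 2 x 2 matrices, chosen for illustration."""
    return 2 * A[0][1] - A[1][0] + 3 * A[1][1]

n = 2
# Entry (i, j) of C is delta(H_ji); note the transposed indices.
C = [[delta(unit(n, j, i)) for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4]]       # arbitrary test matrix
direct = delta(A)          # value of the functional itself
via_trace = trace(mat_mul(A, C))
```

The transposition in the definition of `C` is what makes the double sum $\sum_i\sum_j a_{ij}c_{ji}$ equal to $\mathrm{tr}(AC)$.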


Proposition 14.5 Let $F$ be a field, let $n$ be a positive integer, and let
$\delta \in D(M_{n \times n}(F))$ be a linear functional satisfying $\delta(AB) = \delta(BA)$ for all
$A, B \in M_{n \times n}(F)$. Then there exists a scalar $c \in F$ such that $\delta(A) = c \cdot \mathrm{tr}(A)$
for all $A \in M_{n \times n}(F)$.


Proof Again, for each $1 \le i, j \le n$, let $H_{ij}$ be the matrix having $(i, j)$-entry equal
to 1 and all of the other entries equal to 0. If $1 \le i \ne j \le n$ then $\delta(H_{ij}) =
\delta(H_{ii}H_{ij}) = \delta(H_{ij}H_{ii}) = \delta(O) = 0$. Moreover, for all $1 \le j, k \le n$ we have
$\delta(H_{jj}) = \delta(H_{jk}H_{kj}) = \delta(H_{kj}H_{jk}) = \delta(H_{kk})$. Thus we see that there exists a $c \in F$
such that $\delta(H_{jj}) = c$ for all $1 \le j \le n$ and from Proposition 14.4 we conclude that
$\delta(A) = \mathrm{tr}(A \cdot cI) = c \cdot \mathrm{tr}(A)$ for all $A \in M_{n \times n}(F)$.


Proposition 14.6 (Taber's Theorem) Let $F$ be a field, let $n$ be a positive
integer, and let $A \in M_{n \times n}(F)$ be a matrix the characteristic polynomial of
which is completely reducible. Then $\mathrm{tr}(A)$ is the sum of the eigenvalues of $A$
(with the appropriate multiplicities).


Proof Let $p(X) = \sum_{i=0}^n c_iX^i$ be the characteristic polynomial of $A$. We know that
this polynomial is completely reducible, say $p(X) = \prod_{i=1}^n (X - b_i)$, and after
multiplying this out, we see that $c_{n-1} = -\sum_{i=1}^n b_i$. But from the definition of the
characteristic polynomial, we also see that $c_{n-1} = -\mathrm{tr}(A)$. Thus we see that, for any
such matrix, $\mathrm{tr}(A)$ is the sum of the eigenvalues of $A$ (with the appropriate
multiplicities).


Example Let $F$ be a field, let $\Omega$ be a nonempty set, and let $V = F^\Omega$. For each $a \in \Omega$
there exists a linear functional $\delta_a \in D(V)$ defined by evaluation: $\delta_a: f \mapsto f(a)$. In
the case that $F$ is $\mathbb{R}$ and $\Omega$ is the unit interval of the real line, this functional is
known to physicists as the Dirac functional. In analysis, evaluation functionals are
often used to establish boundary conditions on classes of functions being studied.
Paul Dirac, the Nobel-prize-winning twentieth-century British physicist,
built the first accepted model of quantum mechanics, in which linear
functionals played a fundamental part.


Example Let $n$ be a positive integer and let $c$ and $d$ be complex numbers. Can we
find all matrices $A \in M_{n \times n}(\mathbb{C})$ having the property that $c$ is an eigenvalue of $A$
having geometric multiplicity $n - 1$ and $d$ is an eigenvalue of $A$ having geometric
multiplicity 1? (Certainly one such matrix always exists, namely a diagonal matrix
with $c$ appearing $n - 1$ times on the diagonal and $d$ once.) In general, in order for $c$
to be an eigenvalue of $A$ of geometric multiplicity $n - 1$, the eigenspace associated
with it has to be of dimension $n - 1$. In other words, the nullity of the matrix $A - cI$
must equal $n - 1$. From this we see that the dimension of the column space of $A - cI$
must equal 1, and so there must exist nonzero vectors $u = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$ and $v = \begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix}$ in
$\mathbb{C}^n$ such that $A - cI = u \wedge v$, whence $A = u \wedge v + cI$. Conversely, if $A$ is a matrix of
the form $u \wedge v + cI$, then $c$ is an eigenvalue of $A$ having multiplicity at least $n - 1$.
Note that $\mathrm{tr}(A) = \sum_{i=1}^n b_ie_i + nc$. But, as we just noted, $\mathrm{tr}(A)$ is also the sum of the
eigenvalues of $A$, counted by multiplicity, and so we want it to equal $d + (n - 1)c$.
Thus we are reduced to finding vectors $u$ and $v$ as above satisfying the condition
that $\sum_{i=1}^n b_ie_i = d - c$. This is easy to do in concrete cases.
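
One concrete case, as a hedged illustration: take $n = 3$, $c = 2$, $d = 5$, and vectors $u$, $v$ with $\sum b_ie_i = d - c = 3$. These numbers and the helper names are arbitrary choices made for this sketch, and $u \wedge v$ is taken to be the matrix with $(i, j)$-entry $b_ie_j$. The code builds $A = u \wedge v + cI$ and checks that $u$ is an eigenvector for $d$, that two independent vectors orthogonal to $v$ are eigenvectors for $c$, and that the trace equals $d + (n - 1)c$.

```python
def outer(u, v):
    """Matrix with (i, j)-entry u_i * v_j (assumed meaning of u ^ v)."""
    return [[x * y for y in v] for x in u]

def mat_vec(M, x):
    """Product of a matrix (list of rows) with a column vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

n, c, d = 3, 2, 5          # sample values, not from the text
u = [1, 1, 1]              # entries b_1, ..., b_n
v = [1, 2, 0]              # entries e_1, ..., e_n; sum of b_i * e_i is 3 = d - c

rank_one = outer(u, v)
A = [[rank_one[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]

Au = mat_vec(A, u)                       # should equal d * u
x1, x2 = [2, -1, 0], [0, 0, 1]           # both satisfy v . x = 0
Ax1, Ax2 = mat_vec(A, x1), mat_vec(A, x2)
tr_A = trace(A)                          # should equal d + (n - 1) * c
```

The choice requires $d \ne c$; when $d = c$ the same recipe produces a matrix whose only eigenvalue is $c$.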


The following proposition shows that there are always enough linear functionals to
enable us to distinguish between vectors.


Proposition 14.7 Let $V$ be a vector space over a field $F$. If $v \ne w$ are elements
of $V$ then there exists a linear functional $\delta \in D(V)$ satisfying $\delta(v) \ne \delta(w)$.


Proof Since the set $\{v - w\}$ is linearly independent, it can be completed to a basis
$B$ of $V$. By Proposition 6.2, there exists a linear functional $\delta \in D(V)$ satisfying
$\delta(v - w) = 1$ and $\delta(u) = 0$ for all $u \in B \setminus \{v - w\}$. This is the linear functional we
want.


In particular, if $0_V \ne v \in V$ then there is a linear functional $\delta \in D(V)$ satisfying
$\delta(v) \ne 0$.
Proposition 14.8 Let $V$ be a vector space over a field $F$ having a basis $B$. Then $D(V) \cong F^B$. In
particular, if $V$ is finitely generated over $F$ then $D(V) \cong V$.


Proof We have a function $\alpha: D(V) \to F^B$ given by restriction, and it is
straightforward to check that this function is a linear transformation. Since every element
of $D(V)$ is totally defined by its action on a basis, this function is monic and,
by Proposition 6.2, it is epic. Therefore, it is an isomorphism. If $B$ is finite, then
$V \cong F^{(B)} = F^B$ and so $D(V) \cong V$.


In particular, we see the important relationship between $F^{(\Omega)}$ and $F^\Omega$ for any
nonempty set $\Omega$, namely that $F^\Omega$ is isomorphic to the dual space of $F^{(\Omega)}$. Note too
that $D(V) \not\cong V$ whenever the vector space $V$ is not finitely generated over $F$ since
it can be shown, using the arithmetic of transfinite cardinals, that $F^{(\Omega)}$ and $F^\Omega$ are
never isomorphic when $\Omega$ is infinite.

Let us consider the idea inherent in Proposition 14.8. Let $V$ be a vector space
over a field $F$ and let $B$ be a given basis for $V$. For each $v \in B$, let $\delta_v \in D(V)$
satisfy $\delta_v(v) = 1$ and $\delta_v(u) = 0$ for all $v \ne u \in B$. We claim that $E = \{\delta_v \mid
v \in B\}$ is a linearly-independent subset of $D(V)$. Indeed, if $c_1, \ldots, c_n$ are scalars
in $F$ and $u_1, \ldots, u_n$ are elements of $B$ satisfying the condition that $\sum_{i=1}^n c_i\delta_{u_i}$
is the 0-functional, then for all $1 \le h \le n$ we have $0 = \bigl(\sum_{i=1}^n c_i\delta_{u_i}\bigr)(u_h) =
\sum_{i=1}^n c_i\delta_{u_i}(u_h) = c_h$. This establishes the claim. If $V$ is finitely-generated then $B$ is
finite and so $E$ is a basis for $D(V)$, since it is easy to check that $\delta = \sum_{u \in B} \delta(u)\delta_u$
for all $\delta \in D(V)$. Such a basis $E$ for $D(V)$ is called the dual basis of the basis $B$
for $V$. If $V$ is not finitely generated, then $FE$ is a subspace of $D(V)$ composed of
all those linear functionals $\delta \in D(V)$ satisfying the condition that $\delta(u) \ne 0$ for at
most finitely-many elements $u$ of $B$. This subspace is called the weak dual space
of $V$.
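
For $V = F^n$ the dual basis can be computed explicitly: if the basis vectors are the columns of a matrix $M$, then the $i$th row of $M^{-1}$ gives the coefficients of the functional $\delta_{v_i}$, because $M^{-1}M = I$ says exactly that row $i$ applied to column $j$ is 1 when $i = j$ and 0 otherwise. The sketch below assumes rational entries; the basis of $\mathbb{Q}^3$ and the helper names are illustrative.

```python
from fractions import Fraction

def inverse(M):
    """Inverse of a nonsingular square matrix by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # A singular matrix would leave no pivot here and raise StopIteration.
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                f = aug[i][col]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]

def apply(functional, vector):
    """Value of the functional with the given coefficient row on a vector."""
    return sum(a * b for a, b in zip(functional, vector))

# Hypothetical basis v_1, v_2, v_3 of Q^3 (not from the text).
basis = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
M = [list(col) for col in zip(*basis)]   # columns of M are the basis vectors
dual = inverse(M)                        # row i represents delta_{v_i}

# Table of values delta_{v_i}(v_j); it should be the identity matrix.
table = [[apply(d, v) for v in basis] for d in dual]

# Expansion of an arbitrary functional in the dual basis.
f = [3, 1, 4]
coeffs = [apply(f, v) for v in basis]    # the scalars f(v_i)
rebuilt = [sum(c * d[k] for c, d in zip(coeffs, dual)) for k in range(3)]
```

The last lines illustrate the expansion of a functional in the dual basis, with coefficients given by its values on the basis vectors.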


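Example Let $V$ be the vector space of all polynomial functions in $\mathbb{R}^{\mathbb{R}}$ having
degree at most 4. Suppose that $B = \{a_1, \ldots, a_5\}$ is a set of distinct positive real
numbers and, for each $1 \le i \le 5$, let $\delta_i \in D(V)$ be the linear transformation defined
by $\delta_i: p(t) \mapsto \int_0^{a_i} p(t)\,dt$. We claim that $B = \{\delta_1, \ldots, \delta_5\}$ is a basis for $D(V)$.
Indeed, since we know by Proposition 14.8 that $\dim(D(V)) = 5$, all we have to
show is that the set $B$ is linearly independent. That is to say, we must show that if
there exist real numbers $b_1, \ldots, b_5$ satisfying the condition that $\sum_{i=1}^5 b_i\delta_i$ is the
0-functional, then $b_i = 0$ for all $1 \le i \le 5$. Since $\sum_{i=1}^5 b_i\delta_i(t^h) = \sum_{i=1}^5 \frac{1}{h+1}a_i^{h+1}b_i$
for all $0 \le h \le 4$, we must show that

$$\begin{bmatrix} a_1 & \frac{1}{2}a_1^2 & \frac{1}{3}a_1^3 & \frac{1}{4}a_1^4 & \frac{1}{5}a_1^5 \\ a_2 & \frac{1}{2}a_2^2 & \frac{1}{3}a_2^3 & \frac{1}{4}a_2^4 & \frac{1}{5}a_2^5 \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ a_5 & \frac{1}{2}a_5^2 & \frac{1}{3}a_5^3 & \frac{1}{4}a_5^4 & \frac{1}{5}a_5^5 \end{bmatrix}$$

is nonsingular, which is the case since this is just a nonzero scalar multiple of a Vandermonde
matrix.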
Example Let $a < b$ be real numbers and let $t_1, \ldots, t_n$ be distinct real numbers in
the closed interval $[a, b]$ of the real line and let $W$ be the subspace of $\mathbb{R}^{[a,b]}$
consisting of all polynomial functions of degree less than $n$. Then $\dim(W) = n$. For
each $1 \le i \le n$, let $\delta_i \in D(W)$ be the linear functional defined by $\delta_i: p \mapsto p(t_i)$.
We claim that the subset $B = \{\delta_1, \ldots, \delta_n\}$ of $D(W)$ is linearly independent. Indeed,
if $\sum_{i=1}^n c_i\delta_i$ is the 0-functional and if $1 \le h \le n$, then $0 = \bigl(\sum_{i=1}^n c_i\delta_i\bigr)\prod_{j \ne h}(X -
t_j) = c_h\prod_{j \ne h}(t_h - t_j)$, which implies that $c_h = 0$ since the $t_j$ are distinct. Therefore,
by Proposition 14.8, $B$ is a basis for $D(W)$. Since the function $p \mapsto \int_a^b p(x)\,dx$ also
belongs to $D(W)$, we conclude that there exist real numbers $c_1, \ldots, c_n$ satisfying
$\int_a^b p(x)\,dx = \sum_{i=1}^n c_ip(t_i)$ for any $p \in W$.
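
The scalars $c_1, \ldots, c_n$ in this example are computable: requiring the formula to hold for the monomials $1, x, \ldots, x^{n-1}$ gives $n$ linear equations $\sum_i c_it_i^h = (b^{h+1} - a^{h+1})/(h+1)$, and by linearity the formula then holds for every polynomial of degree less than $n$. The following is a sketch under the assumption of rational nodes and endpoints; the function names and the sample nodes $0, \tfrac12, 1$ on $[0, 1]$ are illustrative, not from the text.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M x = rhs for a nonsingular square M by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                f = aug[i][col]
                aug[i] = [x - f * y for x, y in zip(aug[i], aug[col])]
    return [row[n] for row in aug]

def quadrature_weights(nodes, a, b):
    """Weights c_i with integral of p over [a, b] = sum of c_i * p(t_i), deg p < n."""
    n = len(nodes)
    # Row h imposes exactness on the monomial x^h.
    M = [[Fraction(t) ** h for t in nodes] for h in range(n)]
    rhs = [(Fraction(b) ** (h + 1) - Fraction(a) ** (h + 1)) / (h + 1)
           for h in range(n)]
    return solve(M, rhs)

nodes = [Fraction(0), Fraction(1, 2), Fraction(1)]
weights = quadrature_weights(nodes, 0, 1)

# Check on p(x) = 3x^2 - 2x + 1, whose integral over [0, 1] is 1.
p = lambda x: 3 * x ** 2 - 2 * x + 1
approx = sum(c * p(t) for c, t in zip(weights, nodes))
```

With these three nodes the computed weights are those of Simpson's rule on $[0, 1]$.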


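Let $V$ and $W$ be vector spaces over a field $F$ and let $\alpha \in \mathrm{Hom}(V, W)$. If
$\delta \in D(W)$ then $\delta\alpha \in D(V)$. Moreover, if $\delta_1, \delta_2 \in D(W)$ and if $v \in V$ then
$[(\delta_1 + \delta_2)\alpha](v) = (\delta_1 + \delta_2)\alpha(v) = \delta_1\alpha(v) + \delta_2\alpha(v) = [\delta_1\alpha + \delta_2\alpha](v)$ and so $(\delta_1 +
\delta_2)\alpha = \delta_1\alpha + \delta_2\alpha$. Similarly, if $c \in F$ and if $\delta \in D(W)$ then $c(\delta\alpha) = (c\delta)\alpha$.
Therefore, we see that $\alpha$ defines a linear transformation $D(\alpha) \in \mathrm{Hom}(D(W), D(V))$
by setting $D(\alpha): \delta \mapsto \delta\alpha$. If $V$, $W$, and $Y$ are vector spaces over $F$ and if
$\alpha \in \mathrm{Hom}(V, W)$ and $\beta \in \mathrm{Hom}(W, Y)$ then it is straightforward to show that
$D(\beta\alpha) = D(\alpha)D(\beta)$. If $\alpha$ is an isomorphism, then $D(\alpha)$ is also an isomorphism,
where $D(\alpha)^{-1} = D(\alpha^{-1})$.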


Proposition 14.9 Let $F$ be a field and let $V$ and $W$ be vector spaces finitely
generated over $F$. Let $B = \{v_1, \ldots, v_k\}$ be a basis for $V$, the dual basis of
which is $C = \{\delta_1, \ldots, \delta_k\}$, and let $D = \{w_1, \ldots, w_n\}$ be a basis for $W$, the
dual basis of which is $E = \{\eta_1, \ldots, \eta_n\}$. If $\alpha : V \to W$ is a linear transformation
then $\Phi_{EC}(D(\alpha)) = \Phi_{BD}(\alpha)^T$.


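Proof Let $\Phi_{BD}(\alpha) = [a_{ij}]$. For each $1 \le i \le k$ we have $\alpha(v_i) = \sum_{h=1}^n a_{hi}w_h$
and so for all $1 \le i \le k$ and all $1 \le j \le n$ we have $[D(\alpha)(\eta_j)](v_i) = \eta_j\alpha(v_i) =
\sum_{h=1}^n a_{hi}\eta_j(w_h) = a_{ji}$. But each $\delta \in D(V)$ satisfies $\delta = \sum_{i=1}^k \delta(v_i)\delta_i$ and so, in
particular, $D(\alpha)(\eta_j) = \sum_{i=1}^k [D(\alpha)(\eta_j)](v_i)\delta_i = \sum_{i=1}^k a_{ji}\delta_i$, which gives the
desired result.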


We have already seen that, given a vector space $V$ over a field $F$, we can build
the dual space $D(V)$. Since this too is a vector space over $F$, we can go on to
build its dual space, $D^2(V) = D(D(V))$. What do some elements of this space look
like? Each $v \in V$ defines a function $\theta_v : D(V) \to F$ by setting $\theta_v : \delta \mapsto \delta(v)$. This
is indeed a linear functional and so is an element of $D^2(V)$, which we call the
evaluation functional at $v$.


Proposition 14.10 Let $V$ be a vector space over a field $F$. The function
$v \mapsto \theta_v$ is a monomorphism from $V$ to $D^2(V)$, which is an isomorphism in
the case $V$ is finitely generated.


Proof We first have to show that this function is a linear transformation. And,
indeed, if $v, w \in V$, if $a \in F$, and if $\delta \in D(V)$, then as a direct consequence of
the definitions we obtain $\theta_{v+w}(\delta) = \delta(v + w) = \delta(v) + \delta(w) = \theta_v(\delta) + \theta_w(\delta) =
[\theta_v + \theta_w](\delta)$ and so $\theta_{v+w} = \theta_v + \theta_w$. Similarly, $\theta_{av}(\delta) = \delta(av) = a\delta(v) = a\theta_v(\delta)$
and so $\theta_{av} = a\theta_v$. Thus we have shown that we do indeed have a homomorphism.
If $v$ belongs to the kernel of this function then $\theta_v(\delta) = \delta(v) = 0$ for all $\delta \in D(V)$ and
so, by Proposition 14.5, we know that $v = 0_V$. Thus it is a monomorphism. Finally,
if $V$ is finitely generated then, by Proposition 14.6, we see that $\dim(D^2(V)) =
\dim(D(V)) = \dim(V)$ and so any monomorphism from $V$ to $D^2(V)$ has to be an
isomorphism.


We should note that the importance of Proposition 14.10 lies not in the existence 
of an isomorphism between V and D?(V), which could be inferred from dimension 
arguments alone, but in finding a specific, natural, such isomorphism. 

A proper subspace W of a vector space V over a field F is a maximal subspace if 
and only if there is no subspace of V properly contained in V and properly contain- 
ing W. By the Hausdorff Maximum Principle, we know that any nontrivial vector 
space contains a maximal subspace. The maximal subspaces of finitely-generated 
vector spaces are usually called hyperplanes of the space. We will now use linear 
functionals in order to characterize these subspaces of V. 


Proposition 14.11 A subspace $W$ of a vector space $V$ over a field $F$ is
maximal if and only if there exists a linear functional $\delta \in D(V)$ which is not the
0-functional, with kernel $W$.


Proof Let us assume that $W = \ker(\delta)$, where $\delta$ is a linear functional which is not the
0-functional, and assume that there exists a proper subspace $Y$ of $V$ which properly
contains $W$. Pick $y \in Y \setminus W$ and $x \in V \setminus Y$. These two vectors have to be nonzero
and the set $\{x, y\}$ is linearly independent by Proposition 5.3, since $Fy \subseteq Y$ and
$x \notin Y$. Set $U = F\{x, y\}$. Then $\ker(\delta)$ and $U$ are disjoint, so the restriction of $\delta$ to
$U$ is a monomorphism, which is impossible since $\dim(U) = 2$ and $\dim(F) = 1$.
Therefore, $W$ must be a maximal subspace of $V$. Conversely, let $W$ be a maximal
subspace of $V$ and let $y \in V \setminus W$. Then $Fy \cap W = \{0_V\}$ and $Fy + W = V$ by the
maximality of $W$. Therefore, $V = Fy \oplus W$ and so every vector in $V$ can be written
in the form $ay + w$, where $a \in F$ and $w \in W$. The function $\delta : ay + w \mapsto a$ is a
linear functional in $D(V)$ the kernel of which equals $W$.


Proposition 14.12 Let $V$ be a vector space over a field $F$ and let $\delta, \delta_1, \ldots, \delta_n$
be elements of $D(V)$. Then $\delta \in F\{\delta_1, \ldots, \delta_n\}$ if and only if
$\bigcap_{i=1}^n \ker(\delta_i) \subseteq \ker(\delta)$.


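Proof Assume that $\delta \in F\{\delta_1, \ldots, \delta_n\}$. Then there exist scalars $a_1, \ldots, a_n$ such that
$\delta = \sum_{i=1}^n a_i\delta_i$. If $v \in \bigcap_{i=1}^n \ker(\delta_i)$ then $\delta_i(v) = 0$ for all $1 \le i \le n$ and so $\delta(v) =
\sum_{i=1}^n a_i\delta_i(v) = 0$. Thus $v \in \ker(\delta)$. Conversely, suppose that $\bigcap_{i=1}^n \ker(\delta_i) \subseteq \ker(\delta)$.
We will proceed by induction on $n$. First, assume that $n = 1$. If $\delta$ is the 0-functional,
then surely we are done. Thus let us assume that this is not the case and let $v \in
V \setminus \ker(\delta)$. Since $\ker(\delta_1) \subseteq \ker(\delta)$, this means that $\delta_1(v) \ne 0$. Set $a = \delta_1(v)^{-1}\delta(v)$.
Then $\delta(v) = a\delta_1(v) = (a\delta_1)(v)$ and so $v \in \ker(\delta - a\delta_1)$. But $\ker(\delta_1) \subseteq \ker(\delta - a\delta_1)$,
and this containment is proper, since $v \notin \ker(\delta_1)$. By Proposition 14.11, $\ker(\delta_1)$ is a
maximal subspace of $V$ and so $\ker(\delta - a\delta_1) = V$, which shows that $\delta = a\delta_1$.

Now let us assume that we have proved the result for a given $n$ and assume we
have linear functionals $\delta, \delta_1, \ldots, \delta_{n+1}$ in $D(V)$ satisfying $\bigcap_{i=1}^{n+1} \ker(\delta_i) \subseteq \ker(\delta)$.
Set $W = \ker(\delta_{n+1})$ and for each $1 \le i \le n$ let $\beta_i$ be the restriction of $\delta_i$ to $W$.
Also, let $\beta$ be the restriction of $\delta$ to $W$. Then $\bigcap_{i=1}^n \ker(\beta_i) \subseteq \ker(\beta)$ and so, by
the induction hypothesis, we know that there exist scalars $a_1, \ldots, a_n$ such that $\beta =
\sum_{i=1}^n a_i\beta_i$. Therefore, $\ker(\delta_{n+1}) \subseteq \ker(\delta - \sum_{i=1}^n a_i\delta_i)$ and, as in the case $n = 1$, it
follows that there exists a scalar $a_{n+1}$ such that $\delta - \sum_{i=1}^n a_i\delta_i = a_{n+1}\delta_{n+1}$, proving
that $\delta = \sum_{i=1}^{n+1} a_i\delta_i$.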


In the context of functional analysis, the following consequence of Propo- 
sition 14.11, taken together with the Riesz Representation Theorem (Proposi- 
tion 16.14), is known as the Fredholm alternative, and has many important applica- 
tions. 


The Swedish mathematician Ivar Fredholm was active in the late 
nineteenth century and studied the solvability of integral equations. 


Proposition 14.13 Let $V$ and $W$ be vector spaces over a field $F$, let
$\alpha \in \mathrm{Hom}(V, W)$, and let $w \in W$. Then $w \in \mathrm{im}(\alpha)$ if and only if $w \in \ker(\delta)$
for any $\delta \in D(W)$ satisfying $\mathrm{im}(\alpha) \subseteq \ker(\delta)$.


Proof If $w \in \mathrm{im}(\alpha)$ then the given condition clearly holds. Conversely, assume that
$w \notin \mathrm{im}(\alpha)$ and let $B$ be a basis for $\mathrm{im}(\alpha)$. By Proposition 5.3, the set $\{w\} \cup B$
is linearly independent, and so there exists a subset $B'$ of $W$ containing $B$ such
that $\{w\} \cup B'$ is a basis for $W$. Then $FB'$ is a maximal subspace of $W$ and so, by
Proposition 14.11, there exists a $\delta \in D(W)$ satisfying $\delta(w) \ne 0$ and $\mathrm{im}(\alpha) \subseteq FB' =
\ker(\delta)$.


Exercises 


Exercise 882 

Let $V = C(0, 1)$. From calculus we know that for each $f \in V$ there exists a
maximal element $a_f$ of $\{f(t) \mid 0 \le t \le 1\}$. Is the function $f \mapsto a_f$ a linear functional
on $V$?


Exercise 883

Let $W$ be a subspace of $\mathbb{Q}[X]$ generated by a countably-infinite
linearly-independent set $\{p_1(X), p_2(X), \ldots\}$ of polynomials. Let $\delta : W \to \mathbb{Q}$ be the
function defined by $\delta : \sum_{i=1}^{\infty} a_i p_i(X) \mapsto \sum_{i=1}^{\infty} a_i \deg(p_i)$ (where only finitely-many
of the $a_i$ are nonzero). Does $\delta$ belong to $D(W)$?


Exercise 884

Let $F = GF(2)$ and let $\delta : F^3 \to F$ be the function which assigns to each vector
$v = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$ the value (0 or 1) appearing in the majority of entries of $v$. Is $\delta$ a linear
functional?


Exercise 885

Find a linear functional $\delta \in D(\mathbb{R}^3)$ which is not the 0-functional but which
satisfies $\delta\left(\begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix}\right) = \delta\left(\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}\right) = 0$.


Exercise 886 


Let V = Q[X] and to each vector v = [b1, bo, ...] € Q™ assign a linear func- 
tional 5, € D(V) defined by 3, : 072.9 an X" > Og nan by +1. Is the function 
a: Q®* — D(V) defined by v +> 6, an isomorphism? 


Exercise 887 
Let $V$ be a vector space over a field $F$ and let $\alpha, \beta \in D(V)$ satisfy the condition
that $\ker(\beta) \subseteq \ker(\alpha)$. Show that $\alpha \in F\beta$.


Exercise 888
Let $F$ be a field and let $0 \ne a \in F$. Let $\alpha : F[X] \to F$ be the function defined by
$\alpha : p(X) \mapsto p(a) - p(0)$. Is $\alpha$ a linear functional?


Exercise 889

Let F be a field of characteristic 0 and let n be a positive integer. Show that 
any matrix A € M,x»(F) is similar to a matrix all diagonal entries of which are 
equal to 0. 


Exercise 890 
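Let $V = \mathbb{R}^3$ and consider the linear functionals

\[
\delta_1 : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto 2a - b + 3c, \qquad
\delta_2 : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto 3a - 5b + c, \qquad\text{and}\qquad
\delta_3 : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto 4a - 7b + c
\]

on $V$. Is $\{\delta_1, \delta_2, \delta_3\}$ a basis for $D(V)$?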


Exercise 891 

Let V be a vector space finitely generated over a field F and let W be a subspace 
of V having a complement Y in V. Show that D(V) = W’ @ Y’, where W’ isa 
subspace of D(V) isomorphic to W and Y’ is a subspace of D(V) isomorphic 
to Y. 


Exercise 892 

Let n be a positive integer and let V be the vector space of all polynomial func- 
tions from R to itself of degree no more than n. For allO <k <n, letd,:V—>R 
be the function defined by 6; : pte i. t* p(t) dt. Show that {5,,...,6,} is a 
basis of D(V). 


Exercise 893 
0 0 1 

Let B= 31,] 11],] —1 | $ CR. Find the dual basis of B. 
29 = 3 


Exercise 894 

Let $n$ be a positive integer and let $V$ be a vector space of dimension $n$ over a
field $F$. Let $B = \{\delta_1, \ldots, \delta_n\}$ be a subset of $D(V)$ and assume that there exists a
vector $0_V \ne v \in V$ satisfying $\delta_i(v) = 0$ for all $1 \le i \le n$. Show that $B$ is linearly
dependent.


Exercise 895 

Let $V$ be a vector space over a field $F$. For every subspace $W$ of $V$, let $E(W) =
\{\delta \in D(V) \mid \ker(\delta) \supseteq W\}$. Show that $E(W)$ is a subspace of $D(V)$. Moreover,
if $\{W_i \mid i \in \Omega\}$ is a family of subspaces of $V$, show that
$E\left(\sum_{i \in \Omega} W_i\right) = \bigcap_{i \in \Omega} E(W_i)$.


Exercise 896

Let $V$ be a vector space finitely generated over a field $F$ and let $W$ be a subspace
of $V$. For $E(W) = \{\delta \in D(V) \mid \ker(\delta) \supseteq W\}$, show that $\dim(W) + \dim(E(W)) =
\dim(V)$.


Exercise 897

Let $V$ be a vector space finitely generated over a field $F$, let $W$ be a subspace
of $V$, and let $Y$ be a subspace of $D(V)$. Are the following conditions equivalent:
(1) $Y = \{\delta \in D(V) \mid \ker(\delta) \supseteq W\}$;
(2) $W = \bigcap_{\delta \in Y} \ker(\delta)$?


Exercise 898 
Let A, B € M2 x2(R). Show that tr(A B) = tr(A) - tr(B) if and only if |A + B| = 
|A| + |B]. 


Exercise 899 

Let $n$ be a positive integer and let $U$ be a finite subset of $M_{n \times n}(\mathbb{C})$ which is
closed under multiplication of matrices. Show that there exists a matrix $A$ in $U$
satisfying $\mathrm{tr}(A) \in \{1, \ldots, n\}$.


Exercise 900

Let $n$ be a positive integer and let $F$ be a field. For any matrix $A = [a_{ij}] \in
M_{n \times n}(F)$, define the antitrace of $A$ to be $\mathrm{antitr}(A) = \sum_{i=1}^n a_{i,n+1-i}$. Is the
function $A \mapsto \mathrm{antitr}(A)$ a linear functional on $M_{n \times n}(F)$?


Exercise 901 
Let F be a field and let A € M2,.2(F) be a matrix satisfying tr(A) = tr(A”) = 0. 
Is it necessarily true that A= O? 


Exercise 902 
Let $k$ and $n$ be positive integers. If $O \ne A \in M_{k \times n}(\mathbb{R})$, does there necessarily
exist a matrix $B \in M_{n \times k}(\mathbb{R})$ satisfying $\mathrm{tr}(AB) \ne 0$?


Exercise 903
Let $F$ be a field and let $k \ne n$ be positive integers. Let $A \in M_{k \times n}(F)$ and
$B \in M_{n \times k}(F)$. Are $\mathrm{tr}(AB)$ and $\mathrm{tr}(BA)$ necessarily equal?


Exercise 904 


1 2-i 1+i 1 1+i 2-i 
Show that the matrices |} 4+7 1+i7 0 and| 3-71 1+i 0 in 
1+i 1 1 1 27 1-i 


M3 3(C) are not similar. 


Exercise 905 
Let n be a positive integer and let V be the subspace of R[X] composed of all 
polynomials of degree at most n. What is the dual basis of {1, X,..., X”}? 


Exercise 906
Let $n$ be a positive integer. If $B$ and $C$ are elements of $M_{n \times n}(\mathbb{R})$ satisfying
$\mathrm{tr}(B) \le \mathrm{tr}(C)$, and if $A \in M_{n \times n}(\mathbb{R})$, is it necessarily true that $\mathrm{tr}(AB) \le \mathrm{tr}(AC)$?


Exercise 907 
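For a matrix $A \in M_{3 \times 3}(\mathbb{R})$, find a positive integer $c$ satisfying

\[
|A| = \frac{1}{c}\begin{vmatrix}
\mathrm{tr}(A) & 1 & 0 \\
\mathrm{tr}(A^2) & \mathrm{tr}(A) & 2 \\
\mathrm{tr}(A^3) & \mathrm{tr}(A^2) & \mathrm{tr}(A)
\end{vmatrix}.
\]


Exercise 908

Let $k$ and $n$ be positive integers and let $F$ be a field. Define a function
$\alpha : M_{kn \times kn}(F) \to M_{n \times n}(F)$ as follows: if $A \in M_{kn \times kn}(F)$, write $A = [A_{ij}]$,
where each $A_{ij}$ is a $(k \times k)$-block. Then set $\alpha(A) = [b_{ij}] \in M_{n \times n}(F)$, where
$b_{ij} = \mathrm{tr}(A_{ij})$ for each $1 \le i, j \le n$. Is $\alpha$ a linear transformation? Is it a
homomorphism of unital $F$-algebras?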


Exercise 909 

Let A be a nonempty set and let V be the collection of all subsets of A, which is 
a vector space over $GF(2)$. Is the characteristic function of $\emptyset \ne D \subseteq A$ a linear
functional on $V$?


Exercise 910 
For each integer n > 1, find a nonsingular matrix A € M, x,(Q) satisfying 
tr(A) = 0. 


Exercise 911 
Let n > 1 be an integer and let A € M,,,(R). Does there necessarily exist a 
symmetric matrix B € M,»,(R) satisfying tr(A) = tr(B)? 


Exercise 912 

Let V be a vector space finitely generated over Q and let a € End(V) be a pro- 
jection. Show that there is a basis D of V satisfying the condition that the rank 
of a equals tr(Ppp(a)). 


Exercise 913 

Let $W$ be a proper subspace of a vector space $V$ over a field $F$ and let $v \in V \setminus W$.
Show that there is a linear functional $\delta \in D(V)$ satisfying $\delta(v) \ne 0$ but $\delta(w) = 0$
for all $w \in W$.


Exercise 914

Let $V$ be a vector space finitely generated over a field $F$ and let $W_1$ and $W_2$ be
proper subspaces of $V$ satisfying $V = W_1 \oplus W_2$. Show that $D(V) = E_1 \oplus E_2$,
where $E_j = \{\delta \in D(V) \mid W_j \subseteq \ker(\delta)\}$ for $j = 1, 2$.


Exercise 915
Let $F$ be a field and let $A \in M_{2 \times 2}(F)$. Show that we always have
$A^2 - \mathrm{tr}(A)A + |A|I = O$.


Exercise 916

Let $V$ be the subspace of $\mathbb{R}^{\infty}$ consisting of all sequences $[a_1, a_2, \ldots] \in \mathbb{R}^{\infty}$
satisfying the condition that $\lim_{i \to \infty} a_i$ exists in $\mathbb{R}$. Define linear functionals
$\delta_1, \delta_2, \ldots, \delta_{\infty} \in D(V)$ by setting $\delta_h : [a_1, a_2, \ldots] \mapsto a_h$ for each $h = 1, 2, \ldots$
and $\delta_{\infty} : [a_1, a_2, \ldots] \mapsto \lim_{i \to \infty} a_i$. Is the subset $\{\delta_1, \delta_2, \ldots, \delta_{\infty}\}$ of $D(V)$
necessarily linearly independent?


Exercise 917 

Let $F$ be a field and, for each $a \in F$, let $\varepsilon_a : F[X] \to F$ be the linear functional
defined by $\varepsilon_a : p(X) \mapsto p(a)$. Show that the subset $\{\varepsilon_a \mid a \in F\}$ of $D(F[X])$ is
linearly independent.


Exercise 918

Let $V$ be a vector space over a field $F$ and let $\delta_1, \delta_2 \in D(V)$ be linear functionals
satisfying the condition that $\delta_1(v)\delta_2(v) = 0$ for all $v \in V$. Show that one of the
$\delta_i$ must be the 0-functional.


Exercise 919

Let $n > 1$ be an integer and let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function which maps
the 0-vector to 0 and which satisfies the condition that $f(v + w) + f(v - w) =
2f(v)$ for all $v, w \in \mathbb{R}^n$. Show that $f \in D(\mathbb{R}^n)$.


Exercise 920 
Let a ER, let n be a positive integer, and let A, B € My x»(R). Does there nec- 
essarily exist a matrix C € M,x,(R) satisfying AC + tr(C)A = B? 


Exercise 921 

Let F be a field and let n be a positive integer. Let 6: Mnxn(F) > F be 
the linear functional given by 6 : [ajj] > )7j_) Ee , %j- Find an endomor- 
phism a of Mnxn(F) satisfying the condition that 6(A) = a - tr(@(A)) for all 
Ae Maxn(F). 


Exercise 922 
Let F be a field and let n be a positive integer. Let A, B € Myx (F) be matrices 
satisfying A* + B* = J and AB + BA = O. Show that tr(A) = tr(B) = 0. 


Exercise 923 
Let F be a field and let n be a positive integer. Given a positive integer k, is it 
necessarily true that tr((AB)*) = tr(A*)tr(B*) for all A, BE Mnxn(F)? 


Exercise 924
Let $V$ be a vector space finitely generated over a field $F$ and let $\alpha \in \mathrm{End}(V)$.
Show that $\alpha$ and $D(\alpha)$ have identical minimal polynomials.


Exercise 925

Let $V$ be a vector space over a field $F$ and let $n$ be a positive integer. Let
$v_1, \ldots, v_n$ be distinct vectors in $V$ and assume that there exist $\alpha \in \mathrm{End}(V)$ and
$\delta \in D(V)$ such that the matrix $[\delta\alpha^{i-1}(v_j)] \in M_{n \times n}(F)$ is nonsingular. Show
that the set $\{v_1, \ldots, v_n\}$ is linearly independent.


Exercise 926
Let $W$ be a subspace of a vector space $V$ over a field $F$. Show that $W$ is a
maximal subspace of $V$ if and only if every complement of $W$ in $V$ has dimension 1.


Exercise 927
Let $F$ be a field and let $n$ be a positive integer. For matrices $A, B \in M_{n \times n}(F)$,
calculate $\mathrm{tr}([AB - BA][AB + BA])$.


Exercise 928 

Let n be a positive integer. Can we find matrices A, B € Mynx (C) satisfying the 
condition that all eigenvalues of A and of B are positive real numbers, but not all 
eigenvalues of A + B are positive real numbers? 


Exercise 929 
Let $k$ and $n$ be positive integers, let $F$ be a field, and let $O \ne A \in M_{k \times n}(F)$.
Does there necessarily exist a matrix $B \in M_{n \times k}(F)$ satisfying $\mathrm{tr}(AB) \ne 0$?


Exercise 930

Let $V$ be a vector space of finite dimension $n$ over a field $F$. A nonempty finite
collection $\{W_1, \ldots, W_k\}$ of hyperplanes of $V$ is co-independent if and only if
$\dim\left(\bigcap_{i=1}^k W_i\right) = n - k$. Is a nonempty subcollection of a co-independent
collection of hyperplanes necessarily co-independent?


Exercise 931
If $V$ is a vector space over $\mathbb{R}$ then the complexification of $D(V)$ is isomorphic to
$\mathrm{Hom}_{\mathbb{R}}(V, \mathbb{C})$ as vector spaces over $\mathbb{C}$.


Exercise 932
Let $F$ be a field and let $k$ and $n$ be positive integers. If $A \in M_{k \times n}(F)$, are
$\mathrm{tr}(AA^T)$ and $\mathrm{tr}(A^T A)$ necessarily equal?


Exercise 933 
Let $F$ be a field of characteristic other than 2. Show that any matrix $A \in M_{2 \times 2}(F)$
can be written in the form $cI + B$, where $c \in F$ and $\mathrm{tr}(B) = 0$.


In this chapter, we will have to restrict the set of fields over which we work. A
subfield $F$ of $\mathbb{R}$ is real Euclidean if and only if for each $0 \le c \in F$ there exists an
element $d \in F$ satisfying $d^2 = c$, and a subfield $K$ of $\mathbb{C}$ is Euclidean if and only if
there exists a real Euclidean field $F$ such that $K = \{a + bi \mid a, b \in F\}$. It is
immediately clear that if $K$ is a Euclidean field and $c \in K$, then $\bar{c} \in K$. Being a Euclidean
field is intimately tied in with the constructibility of elements of the complex plane
by straightedge and compass constructions, and in fact every real Euclidean field
must contain all those real numbers which are the lengths of line segments obtainable
from the unit line segment by straightedge and compass construction methods.
Clearly, $\mathbb{R}$ itself is real Euclidean, while $\mathbb{Q}$, as we have already noted, is not; the set
of real numbers algebraic over $\mathbb{Q}$ is real Euclidean and properly contained in $\mathbb{R}$. The
field $\mathbb{C}$ is Euclidean, and the set of all algebraic numbers is Euclidean and properly
contained in $\mathbb{C}$.

Let $V$ be a vector space over a Euclidean field $F$. A function $\mu$ from $V \times V$ to
$F$ is an inner product on $V$ if and only if:

(1) For each $w \in V$, the function $v \mapsto \mu(v, w)$ from $V$ to $F$ is a linear functional;

(2) If $v, w \in V$ then $\mu(v, w) = \overline{\mu(w, v)}$;

(3) If $v \in V$ then $\mu(v, v)$ is a nonnegative real number, which equals 0 if and only
if $v = 0_V$.

Note that, in the above situation, if $v, w \in V$ then, as a consequence of (2),
$\mu(v, w) + \mu(w, v) = 2\,\mathrm{Re}(\mu(v, w))$ is also always a real number, though it may,
of course, be negative.

In general, once we have fixed an inner product on a space, we will write $\langle v, w \rangle$
instead of $\mu(v, w)$. A vector space over a Euclidean subfield $F$ of $\mathbb{C}$ on which
we have an inner product defined is called an inner product space. Another term for 
such a space, coming from functional analysis, is a pre-Hilbert space. Abstract inner 
product spaces were first studied in an axiomatic manner by von Neumann. While 
inner product spaces over general Euclidean fields may prove to be interesting in 
the future, at the moment the study of such spaces is almost universally restricted 
to spaces over R or C, and so from now on we will do the same and consider 
only these as possible fields of scalars. When we talk about an inner product space 
without specifying the field of scalars, we will always assume that it is one of these
two fields. 


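Example Let $n$ be a positive integer and let $F$ be either $\mathbb{R}$ or $\mathbb{C}$. We define an inner
product on $F^n$, called the dot product, as follows: if
$v = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}$ and $w = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$,
then we set $v \cdot w = \sum_{i=1}^n a_i\bar{b}_i$. Note that if $F = \mathbb{R}$, then this product just coincides
with the interior product $v \odot w$ which we defined earlier. However, that is not true
for the case $F = \mathbb{C}$, so we must be very careful to distinguish between the two
products. This modification of the definition is necessary since, over $\mathbb{C}$, we have
$\begin{bmatrix} 1 \\ i \end{bmatrix} \odot \begin{bmatrix} 1 \\ i \end{bmatrix} = 0$, even though
$\begin{bmatrix} 1 \\ i \end{bmatrix} \ne \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. Hence the interior product $\odot$ is not an
inner product as we have defined it in this chapter.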


With kind permission of the Archives of the Mathematisches
Forschungsinstitut Oberwolfach (Artin).

The problem arises because, in $\mathbb{C}$, 0 can be written as the sum of squares of
nonzero elements. A field $F$ in which 0 cannot be the sum of squares
of nonzero elements of $F$ is formally real; so $\mathbb{R}$
is formally real while $\mathbb{C}$ is not. The theory of
formally real fields was developed in the 1920s
by the Austrian mathematicians Emil Artin and
Otto Schreier.


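We can generalize the previous example. If $F$ is either $\mathbb{R}$ or $\mathbb{C}$, and if $D = [d_{ij}]$
is a nonsingular matrix in $M_{n \times n}(F)$, we can define an inner product on $F^n$ by setting

\[
\left\langle \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix},
\begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \right\rangle
= \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix} D D^H
\begin{bmatrix} \bar{b}_1 \\ \vdots \\ \bar{b}_n \end{bmatrix},
\]

where $D^H = [\bar{d}_{ij}]^T$. The matrix $D^H$ is called the conjugate transpose or Hermitian
transpose of $D$, and it again belongs to $M_{n \times n}(F)$. Conjugate transposes of matrices
over $\mathbb{C}$ will play an important part in the following discussion; of course, $D^H = D^T$
for any matrix $D \in M_{n \times n}(\mathbb{R})$.

The properties of the conjugate transpose are very much like those of the transpose.
Indeed, we note that if $A, B \in M_{n \times n}(\mathbb{C})$ and $c \in \mathbb{C}$, then $(A + B)^H = A^H + B^H$,
$(cA)^H = \bar{c}A^H$, $A^{HH} = A$, and $(AB)^H = B^H A^H$. In particular, if $A$ is nonsingular
then $I = I^H = (AA^{-1})^H = (A^{-1})^H A^H$, proving that $(A^{-1})^H = (A^H)^{-1}$.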


Example If we are given positive real numbers $c_1, \ldots, c_n$ and consider the diagonal
matrix $D = [d_{ij}] \in M_{n \times n}(\mathbb{R})$ the diagonal entries of which are given by
$d_{ii} = \sqrt{c_i}$ for $1 \le i \le n$, then, by the above, we have an inner product on $\mathbb{C}^n$ given by

\[
\left\langle \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix},
\begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \right\rangle
= \sum_{i=1}^n c_i a_i \bar{b}_i.
\]

Such a product is called a weighted dot product.
Weighted dot products are extremely important in statistics and data analysis, where
we often want to emphasize the values of certain parameters and de-emphasize others.
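
As a small illustration of the weighted dot product on $\mathbb{C}^n$, the following Python
sketch evaluates $\sum_i c_i a_i \bar{b}_i$ directly. The function name `weighted_dot` and the
sample vectors are our own choices, not part of the text; the check that the weights are
positive mirrors the hypothesis of the example.

```python
def weighted_dot(c, v, w):
    """Weighted dot product sum_i c_i * v_i * conj(w_i) with positive real weights c_i."""
    if len(c) != len(v) or len(v) != len(w):
        raise ValueError("weights and vectors must have the same length")
    if any(ci <= 0 for ci in c):
        raise ValueError("all weights must be positive")
    # Linear in the first argument, conjugate-linear in the second.
    return sum(ci * a * complex(b).conjugate() for ci, a, b in zip(c, v, w))

v = [1, 1j]
value = weighted_dot([1, 2], v, v)   # 1*|1|^2 + 2*|i|^2
print(value)
```

Taking both arguments equal returns a nonnegative real number, in line with condition (3)
in the definition of an inner product.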


Example Let $a < b$ be real numbers and let $V = C(a, b)$. This is, as we have
seen, a vector space over $\mathbb{R}$, on which we can define an inner product $\langle f, g \rangle =
\int_a^b f(x)g(x)\,dx$. Continuity is important here. The set $Y$ of all functions from $[a, b]$
to $\mathbb{R}$ which are continuous at all but finitely-many points is a subspace of $\mathbb{R}^{[a,b]}$
properly containing $C(a, b)$ but $\langle f, g \rangle = \int_a^b f(x)g(x)\,dx$ is not an inner product
on $Y$. Indeed, if we select a real number $c$ satisfying $a < c < b$ and define the
function $f \in Y$ by

\[
f : x \mapsto \begin{cases} 1 & \text{if } x = c, \\ 0 & \text{otherwise} \end{cases}
\]

then $f$ is a nonzero element of $Y$ but $\langle f, f \rangle = 0$.
Similarly, if $V$ is the set of all continuous complex-valued functions defined on
the closed interval $[a, b]$ in $\mathbb{R}$, then $V$ is a vector space over $\mathbb{C}$, on which we can
define an inner product $\langle f, g \rangle = \int_a^b f(x)\overline{g(x)}\,dx$.


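Example Let $F$ be $\mathbb{R}$ or $\mathbb{C}$, and let $V = M_{n \times n}(F)$, which is a vector space
over $F$. Define an inner product on $V$ by setting $\langle A, B \rangle = \mathrm{tr}(AB^H) = \mathrm{tr}(B^H A)$. If
$A = [a_{ij}]$ and $B = [b_{ij}]$, then $\langle A, B \rangle = \sum_{i=1}^n \sum_{j=1}^n a_{ij}\bar{b}_{ij}$. In particular,
$\langle A, A \rangle = \sum_{i=1}^n \sum_{j=1}^n |a_{ij}|^2$.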
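
The agreement between the trace formula and the entrywise sum in the example above can be
checked numerically. The following Python sketch does this for one pair of $2 \times 2$ complex
matrices; the helper names (`conj_transpose`, `matmul`, `trace`, `trace_inner`) and the
sample matrices are our own and are only meant as an illustration.

```python
def conj_transpose(M):
    """Return the conjugate transpose M^H of a matrix given as a list of rows."""
    return [[complex(M[i][j]).conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

def matmul(A, B):
    """Ordinary matrix product of A and B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def trace_inner(A, B):
    """The inner product <A, B> = tr(A B^H)."""
    return trace(matmul(A, conj_transpose(B)))

A = [[1 + 1j, 2], [0, -1j]]
B = [[3, 1j], [1 - 1j, 2]]

# Entrywise form: sum over i, j of a_ij * conj(b_ij).
entrywise = sum(A[i][j] * complex(B[i][j]).conjugate()
                for i in range(2) for j in range(2))
print(trace_inner(A, B), entrywise, trace_inner(A, A))
```

For the sample matrix $A$ the value $\langle A, A \rangle$ is the sum of the squared absolute values of
its entries, namely $2 + 4 + 0 + 1 = 7$.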


Example Let $V$ be the subspace of $\mathbb{C}^{\infty}$ composed of all those sequences $[c_0, c_1, \ldots]$
of complex numbers satisfying $\sum_{i=0}^{\infty} |c_i|^2 < \infty$. This vector space is very important
in analysis, and we can define an inner product on it by setting
$\langle [c_0, c_1, \ldots], [d_0, d_1, \ldots] \rangle = \sum_{i=0}^{\infty} c_i\bar{d}_i$.


Let F be R or C, and let W be a subspace of an inner product space V over F. 
The restriction of this inner product to a function from W x W to F is an inner 
product on W. Thus we can always assume that any subspace of an inner product 
space V inherits the inner-product-space structure of V. 


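Example Let $V$ be an inner product space over $\mathbb{R}$ and let $K$ be the set of all matrices
of the form $\begin{bmatrix} a & v \\ w & b \end{bmatrix}$, where $a, b \in \mathbb{R}$ and $v, w \in V$. Then $K$ is a vector space over
$\mathbb{R}$, where addition and scalar multiplication are defined by

\[
\begin{bmatrix} a & v \\ w & b \end{bmatrix} + \begin{bmatrix} a' & v' \\ w' & b' \end{bmatrix}
= \begin{bmatrix} a + a' & v + v' \\ w + w' & b + b' \end{bmatrix}
\qquad\text{and}\qquad
c\begin{bmatrix} a & v \\ w & b \end{bmatrix} = \begin{bmatrix} ca & cv \\ cw & cb \end{bmatrix}.
\]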
We create the structure of an $\mathbb{R}$-algebra on $K$ by defining an operation $\bullet$ as follows:

\[
\begin{bmatrix} a & v \\ w & b \end{bmatrix} \bullet \begin{bmatrix} a' & v' \\ w' & b' \end{bmatrix}
= \begin{bmatrix} aa' + \langle v, w' \rangle & av' + b'v \\ a'w + bw' & \langle w, v' \rangle + bb' \end{bmatrix}.
\]

This algebra is a division algebra, called the Cayley algebra, and it is not associative.


We now look at some properties of general inner product spaces. 


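Proposition 15.1 Let $V$ be an inner product space. For $v, w_1, w_2 \in V$ and
for a scalar $a$, we have:

(1) $\langle v, w_1 + w_2 \rangle = \langle v, w_1 \rangle + \langle v, w_2 \rangle$;

(2) $\langle v, aw_1 \rangle = \bar{a}\langle v, w_1 \rangle$;

(3) $\langle 0_V, w_1 \rangle = \langle v, 0_V \rangle = 0$.


Proof From the definition of the inner product, we have $\langle v, w_1 + w_2 \rangle =
\overline{\langle w_1 + w_2, v \rangle} = \overline{\langle w_1, v \rangle + \langle w_2, v \rangle} = \overline{\langle w_1, v \rangle} + \overline{\langle w_2, v \rangle} = \langle v, w_1 \rangle + \langle v, w_2 \rangle$, which
proves (1). We also have $\langle v, aw_1 \rangle = \overline{\langle aw_1, v \rangle} = \overline{a\langle w_1, v \rangle} = \bar{a}\,\overline{\langle w_1, v \rangle} = \bar{a}\langle v, w_1 \rangle$,
which proves (2). Finally, $\langle 0_V, w_1 \rangle = \langle 0 \cdot 0_V, w_1 \rangle = 0\langle 0_V, w_1 \rangle = 0$, and similarly $\langle v, 0_V \rangle = 0$,
proving (3).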


By Proposition 15.1 we see that if $V$ is an inner product space over $\mathbb{R}$ then for
each $v \in V$ the function $w \mapsto \langle v, w \rangle$ from $V$ to $\mathbb{R}$ is a linear transformation, but that
is not the case for inner product spaces over $\mathbb{C}$.

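Let $V$ be a finitely-generated inner product space and let $v_1, \ldots, v_k$ be a list of
vectors in $V$. The Gram matrix of this list is the $k \times k$ matrix $G = [g_{ij}]$ defined
by $g_{ij} = \langle v_i, v_j \rangle$ for all $1 \le i, j \le k$. Let $B = \{v_1, \ldots, v_n\}$ be a basis for $V$. Given
vectors $v = \sum_{i=1}^n a_i v_i$ and $w = \sum_{j=1}^n b_j v_j$ in $V$, we note that

\[
\langle v, w \rangle = \sum_{i=1}^n \sum_{j=1}^n a_i\bar{b}_j\langle v_i, v_j \rangle
= \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix} G \begin{bmatrix} \bar{b}_1 \\ \vdots \\ \bar{b}_n \end{bmatrix},
\]

where $G$ is the Gram matrix of $B$.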
Jorgen Gram was a Danish mathematician who at the end of the nine- 


teenth century developed computational techniques for inner product 
spaces in connection with his work for insurance companies. 


Example Let $V$ be the subspace of $\mathbb{C}[X]$ consisting of all polynomials of degree
at most 5, and let $B$ be the canonical basis for $V$. Define an inner product on $V$
by setting $\langle f, g \rangle = \int_0^1 f(x)\overline{g(x)}\,dx$. (Note that we are using the same notation for
a polynomial and its corresponding polynomial function in $\mathbb{C}^{\mathbb{C}}$.) Then the Gram
matrix defined by $B$ is precisely the Hilbert matrix $H_6$, which we have seen earlier.
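
The claim in the preceding example can be checked by direct computation. The Python sketch
below builds the Gram matrix of the monomials $1, X, \ldots, X^5$ under the inner product
$\int_0^1 f(x)g(x)\,dx$ and compares it with the matrix whose $(i, j)$ entry is $1/(i + j - 1)$.
It assumes real coefficients (so conjugation plays no role) and that the interval of
integration is $[0, 1]$; the names `integrate_product` and `gram_matrix` are ours.

```python
from fractions import Fraction

def integrate_product(f, g):
    """Integral over [0, 1] of f(x) g(x) for real polynomials given as
    coefficient lists with the constant term first."""
    total = Fraction(0)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            # The integral of x^(i+j) over [0, 1] equals 1/(i+j+1).
            total += Fraction(a) * Fraction(b) / (i + j + 1)
    return total

def gram_matrix(vectors, inner):
    """Gram matrix [<v_i, v_j>] of a list of vectors for a given inner product."""
    return [[inner(v, w) for w in vectors] for v in vectors]

# Canonical basis 1, X, ..., X^5 as coefficient lists.
basis = [[0] * k + [1] for k in range(6)]
G = gram_matrix(basis, integrate_product)

# Matrix with entries 1/(i+j-1) for 1 <= i, j <= 6.
hilbert = [[Fraction(1, i + j - 1) for j in range(1, 7)] for i in range(1, 7)]
print(G == hilbert)
```

The same `gram_matrix` helper works for any list of vectors once a function computing the
inner product is supplied.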


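Proposition 15.2 (Cauchy-Schwarz-Bunyakovsky Theorem) Let $V$ be an
inner product space. If $v, w \in V$, then $|\langle v, w \rangle|^2 \le \langle v, v \rangle\langle w, w \rangle$.


Proof If $v = 0_V$ or $w = 0_V$ then the result is immediate, and so we can assume that
both vectors differ from $0_V$. Let $a = -\langle w, v \rangle$ and $b = \langle v, v \rangle$. Then $\bar{a} = -\langle v, w \rangle$ and
$\bar{b} = b$ so

\[
\begin{aligned}
0 \le \langle av + bw, av + bw \rangle
&= a\bar{a}\langle v, v \rangle + a\bar{b}\langle v, w \rangle + b\bar{a}\langle w, v \rangle + b\bar{b}\langle w, w \rangle \\
&= a\bar{a}b - ab\bar{a} - a\bar{a}b + b^2\langle w, w \rangle = b[-a\bar{a} + b\langle w, w \rangle].
\end{aligned}
\]

Since $v \ne 0_V$, it follows that $b$ is a positive real number and so $a\bar{a} \le b\langle w, w \rangle$, which
is what we want, since $a\bar{a} = |\langle v, w \rangle|^2$ and $b = \langle v, v \rangle$.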


With kind permission of ETH-Bibliothek Zurich, Image 
Archive (Schwarz). 

Hermann Schwarz was a German mathematician
who in the late nineteenth century studied spaces 
of functions and their structure as inner product 
spaces. Viktor Yakovlevich Bunyakovsky was a 
Russian student of Cauchy who proved this the- 
orem a generation before Schwarz, but since his 
work was published in an obscure journal, it was not widely recognized until the twentieth 
century. 


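Example If $a_1, \ldots, a_n, b_1, \ldots, b_n, c_1, \ldots, c_n$ are real numbers, with $c_i > 0$ for all
$1 \le i \le n$, then

\[
\left(\sum_{i=1}^n c_i a_i b_i\right)^2 \le \left(\sum_{i=1}^n c_i a_i^2\right)\left(\sum_{i=1}^n c_i b_i^2\right).
\]

Indeed, this is a consequence of the Cauchy-Schwarz-Bunyakovsky Theorem, using the
weighted dot product
$\left\langle \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}, \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} \right\rangle = \sum_{i=1}^n c_i a_i b_i$
defined on $\mathbb{R}^n$.
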

In general, the Cauchy-Schwarz-Bunyakovsky Theorem is an extremely rich
source of inequalities between real-valued functions of several real variables. For
example, consider the vectors
$v = \frac{1}{\sqrt{a+b+c}}\begin{bmatrix} \sqrt{a+b} \\ \sqrt{a+c} \\ \sqrt{b+c} \end{bmatrix}$ and $w = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ in $\mathbb{R}^3$, where
$a$, $b$, and $c$ are positive. Since $\langle v, v \rangle = 2$ and $\langle w, w \rangle = 3$, the
Cauchy-Schwarz-Bunyakovsky Theorem shows that

\[
\sqrt{\frac{a+b}{a+b+c}} + \sqrt{\frac{a+c}{a+b+c}} + \sqrt{\frac{b+c}{a+b+c}} \le \sqrt{6}.
\]

Similarly, we note that the matrix D = ies vA € M?2x2(R) is nonsingular and 


so, by a previous example, we have an inner product jz on R? defined by 


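\[
\mu\left(\begin{bmatrix} a \\ b \end{bmatrix}, \begin{bmatrix} c \\ d \end{bmatrix}\right)
= \begin{bmatrix} a & b \end{bmatrix} D D^T \begin{bmatrix} c \\ d \end{bmatrix}
= 3(ac + bd) + \sqrt{3}(ad + bc).
\]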


Applying the Cauchy—Schwarz—Bunyakovsky Theorem, we see that for all real 
numbers a, b, c, and d we have 


\[
[3(ac + bd) + \sqrt{3}(ad + bc)]^2 \le [3(a^2 + b^2) + 2\sqrt{3}\,ab][3(c^2 + d^2) + 2\sqrt{3}\,cd].
\]

In particular, if we take $b = d = \sqrt{3}$, we see that $(ac + a + c + 3)^2 \le (a^2 + 2a +
3)(c^2 + 2c + 3)$ for all real numbers $a$ and $c$.
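
The final inequality above is easy to test by brute force. The following Python sketch checks
it for all integer pairs $(a, c)$ in a small range, using exact integer arithmetic; it is only a
sanity check on sample values, not a proof, and the function names `lhs` and `rhs` are ours.

```python
def lhs(a, c):
    """Left-hand side (ac + a + c + 3)^2 of the inequality."""
    return (a * c + a + c + 3) ** 2

def rhs(a, c):
    """Right-hand side (a^2 + 2a + 3)(c^2 + 2c + 3) of the inequality."""
    return (a * a + 2 * a + 3) * (c * c + 2 * c + 3)

grid = range(-20, 21)
# Pairs on the integer grid for which the inequality would fail (expected: none).
violations = [(a, c) for a in grid for c in grid if lhs(a, c) > rhs(a, c)]
# Pairs for which equality holds.
equalities = [(a, c) for a in grid for c in grid if lhs(a, c) == rhs(a, c)]
print(len(violations), all(a == c for a, c in equalities))
```

Equality on this grid occurs only when $a = c$, which corresponds to the two vectors
$\begin{bmatrix} a \\ \sqrt{3} \end{bmatrix}$ and $\begin{bmatrix} c \\ \sqrt{3} \end{bmatrix}$ being linearly dependent.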

Let $V$ be an inner product space. The norm of a vector $v \in V$ is defined to be the
scalar $\|v\| = \sqrt{\langle v, v \rangle}$. A vector $v$ satisfying $\|v\| = 1$ is normal.


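Example Let $V = \mathbb{R}^n$, and endow $V$ with the dot product. Then

\[
\left\| \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \right\| = \sqrt{\sum_{i=1}^n a_i^2}.
\]

This norm is known as the Euclidean norm on $V$.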


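Example Let $V = C(-\pi, \pi)$, on which we have defined the inner product
$\langle f, g \rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx$. For each positive integer $k$, consider the function
$f_k : x \mapsto \sin(kx)$. Then $\|f_k\| = \sqrt{\langle f_k, f_k \rangle} = \sqrt{\int_{-\pi}^{\pi} \sin^2(kx)\,dx} = \sqrt{\pi}$ and so
$g_k = \frac{1}{\sqrt{\pi}} f_k$ is a normal vector in this space.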
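
The value $\|f_k\| = \sqrt{\pi}$ can be confirmed numerically. The Python sketch below approximates
the integral of $\sin^2(kx)$ over $[-\pi, \pi]$ with composite Simpson's rule; the choice of
quadrature rule, the number of subintervals, and the names `simpson` and `norm_sin` are
our own assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson approximation of the integral of f over [a, b] (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def norm_sin(k):
    """Approximate norm of x -> sin(kx) in C(-pi, pi) with <f, g> = integral of f*g."""
    return math.sqrt(simpson(lambda x: math.sin(k * x) ** 2, -math.pi, math.pi))

for k in range(1, 6):
    print(k, norm_sin(k), math.sqrt(math.pi))
```

Dividing $f_k$ by this norm gives the normal vector $g_k$ of the example.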


We have seen how the vector space $\mathbb{R}^3$, endowed with the cross product $\times$, is a
Lie algebra. It is easy to check that the cross product is related to the dot product on
$\mathbb{R}^3$ by the relations

\[
u \times (v \times w) = (u \cdot w)v - (u \cdot v)w \qquad\text{and}\qquad
(u \times v) \times w = (u \cdot w)v - (v \cdot w)u
\]

for all $u, v, w \in \mathbb{R}^3$. Moreover, we have the following identities:
(1) $v \cdot (v \times w) = 0$ for all $v, w \in \mathbb{R}^3$;
(2) (Lagrange identity) $\|v \times w\|^2 = \|v\|^2\|w\|^2 - (v \cdot w)^2$.
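There are only two possible anticommutative operations on $\mathbb{R}^3$ which turn it into
an $\mathbb{R}$-algebra satisfying these two identities, namely $\times$ and the operation $\times'$ given
by $v \times' w = -(v \times w)$. Furthermore, if $n > 3$ no such operation can be defined
on $\mathbb{R}^n$, except for the case of $n = 7$. In that case, we can define an operation $\times$ as
follows: write elements of $\mathbb{R}^7$ in the form $\begin{bmatrix} v \\ a \\ v' \end{bmatrix}$, where $v, v' \in \mathbb{R}^3$ and $a \in \mathbb{R}$, and
then set

\[
\begin{bmatrix} v \\ a \\ v' \end{bmatrix} \times \begin{bmatrix} w \\ b \\ w' \end{bmatrix}
= \begin{bmatrix}
aw' - bv' + (v \times w) - (v' \times w') \\
-v \cdot w' + v' \cdot w \\
bv - aw + (v \times w') - (v' \times w)
\end{bmatrix}.
\]
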
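We also note that if $u = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}$, $v = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$, and $w = \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}$ in $\mathbb{R}^3$ then
$u \cdot (v \times w) = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$. As an immediate consequence, we observe that if
$u, v, w \in \mathbb{R}^3$ then:

(1) $u \cdot (v \times w) = v \cdot (w \times u) = w \cdot (u \times v)$;

(2) $u \cdot (v \times w) = 0$ if and only if two of these vectors are equal or the set $\{u, v, w\}$
is linearly dependent.

The scalar value $u \cdot (v \times w)$ is often called the scalar triple product of the vectors
$u, v, w$, to distinguish it from the vector triple product $u \times (v \times w)$.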
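
The identities relating the cross product and the dot product on $\mathbb{R}^3$ can be verified
on concrete vectors. The following Python sketch, with helper names (`dot`, `cross`, `det3`)
and sample vectors of our own choosing, checks the two vector triple product relations, the
Lagrange identity, and the determinant formula for the scalar triple product using exact
integer arithmetic.

```python
def dot(u, v):
    """Dot product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def det3(u, v, w):
    """Determinant of the 3x3 matrix whose columns are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - v[0] * (u[1] * w[2] - u[2] * w[1])
            + w[0] * (u[1] * v[2] - u[2] * v[1]))

u, v, w = [1, 2, 3], [-1, 0, 4], [2, 5, -2]

# u x (v x w) = (u.w)v - (u.v)w
lhs1 = cross(u, cross(v, w))
rhs1 = [dot(u, w) * b - dot(u, v) * c for b, c in zip(v, w)]
# (u x v) x w = (u.w)v - (v.w)u
lhs2 = cross(cross(u, v), w)
rhs2 = [dot(u, w) * b - dot(v, w) * a for a, b in zip(u, v)]
# Lagrange identity: ||v x w||^2 = ||v||^2 ||w||^2 - (v.w)^2
lagrange_gap = dot(cross(v, w), cross(v, w)) - (dot(v, v) * dot(w, w) - dot(v, w) ** 2)

print(lhs1 == rhs1, lhs2 == rhs2, lagrange_gap, dot(u, cross(v, w)), det3(u, v, w))
```

For these sample vectors the scalar triple product equals $-23$, so the set $\{u, v, w\}$ is
linearly independent.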


Proposition 15.3 Let $V$ be an inner product space. If $v, w \in V$ and if $a$ is a
scalar, then:

(1) $\|av\| = |a| \cdot \|v\|$;

(2) $\|v\| \ge 0$, with equality if and only if $v = 0_V$;

(3) (Minkowski's inequality): $\|v + w\| \le \|v\| + \|w\|$;

(4) (Parallelogram law): $\|v + w\|^2 + \|v - w\|^2 = 2(\|v\|^2 + \|w\|^2)$;

(5) (Triangle difference inequality): $\|v - w\| \ge \bigl|\|v\| - \|w\|\bigr|$.


With kind permission of ETH-Bibliothek Zurich, Image Archive.


Hermann Minkowski, a German mathematician at the end of the nine- 
teenth century, built an elegant mathematical framework for the theory 
of relativity, using four-dimensional non-Euclidean geometry. 


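Proof We see that $\|av\| = \sqrt{\langle av, av \rangle} = \sqrt{a\bar{a}\langle v, v \rangle} = |a| \cdot \|v\|$, proving (1).
Inequality (2) follows immediately from the definition. As for (3), note that if $z = a + bi$
then $z + \bar{z} = 2a \le 2|a| = 2\sqrt{a^2} \le 2\sqrt{a^2 + b^2} = 2|z|$. As a consequence of the
Cauchy-Schwarz-Bunyakovsky Theorem, we see that $|\langle v, w \rangle| = |\langle w, v \rangle| \le \|v\| \cdot \|w\|$,
and so

\[
\begin{aligned}
\|v + w\|^2 = \langle v + w, v + w \rangle &= \langle v, v \rangle + \langle v, w \rangle + \langle w, v \rangle + \langle w, w \rangle \\
&\le \|v\|^2 + 2\|v\| \cdot \|w\| + \|w\|^2 = (\|v\| + \|w\|)^2,
\end{aligned}
\]

and that proves (3). Moreover, we know that

\[
\|v + w\|^2 = \langle v + w, v + w \rangle = \langle v, v \rangle + \langle v, w \rangle + \langle w, v \rangle + \langle w, w \rangle
\]

and $\|v - w\|^2 = \langle v - w, v - w \rangle = \langle v, v \rangle - \langle v, w \rangle - \langle w, v \rangle + \langle w, w \rangle$. Adding these
two gives us (4).

Finally, by (3), we have $\|v\| = \|w + (v - w)\| \le \|w\| + \|v - w\|$, and so
$\|v - w\| \ge \|v\| - \|w\|$. Interchanging the roles of $v$ and $w$ and using (1), gives
us $\|v - w\| = \|w - v\| \ge \|w\| - \|v\|$, and so we have (5).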
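
The statements of Proposition 15.3 can be illustrated numerically for the dot product on
$\mathbb{C}^3$. The Python sketch below uses sample vectors and helper names (`inner`, `norm`,
`add`, `sub`) of our own choosing, and compares floating-point values up to a small tolerance;
it is a check on one example, not a substitute for the proof.

```python
import math

def inner(v, w):
    """Dot product on C^n: sum of v_i * conj(w_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(v, w))

def norm(v):
    """Norm induced by the dot product."""
    return math.sqrt(inner(v, v).real)

def add(v, w):
    return [a + b for a, b in zip(v, w)]

def sub(v, w):
    return [a - b for a, b in zip(v, w)]

v = [1 + 2j, -1j, 3]
w = [2, 1 - 1j, -4j]

# (3) Minkowski's inequality
minkowski_ok = norm(add(v, w)) <= norm(v) + norm(w) + 1e-12
# (4) Parallelogram law: the gap should vanish up to rounding
parallelogram_gap = (norm(add(v, w)) ** 2 + norm(sub(v, w)) ** 2
                     - 2 * (norm(v) ** 2 + norm(w) ** 2))
# (5) Triangle difference inequality
difference_ok = norm(sub(v, w)) >= abs(norm(v) - norm(w)) - 1e-12

print(minkowski_ok, parallelogram_gap, difference_ok)
```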


Note that by Proposition 15.3 we see that if $0_V \ne v \in V$ then $\frac{1}{\|v\|}v$ is a normal
vector. Moreover, if v is normal and c is a scalar satisfying $|c| = 1$, then cv is again
normal.


Example Let V be an inner product space, and let $\Omega$ be a nonempty set. A function
$f \in V^\Omega$ is bounded if and only if there exists a real number $b_f$ satisfying $\|f(i)\| \le b_f$
for all $i \in \Omega$. If $f, g \in V^\Omega$ are bounded functions then, from Minkowski’s inequality,
we conclude that $\|(f + g)(i)\| \le \|f(i)\| + \|g(i)\| \le b_f + b_g$ for all $i \in \Omega$.
If c is a scalar then $\|(cf)(i)\| = |c| \cdot \|f(i)\| \le |c|\,b_f$ for all $i \in \Omega$. Thus $f + g$
and $cf$ are both bounded, and we see that the set of all bounded elements of $V^\Omega$ is
a subspace of $V^\Omega$.


Example We now return to a previous example. Let p be an integer greater
than 1, not necessarily prime, and let $G = \mathbb{Z}/(p)$, on which we have an operation
of addition as defined in Chap. 2. Let $V = \mathbb{C}^G$, which is a vector space
of dimension p over $\mathbb{C}$. On this space, we can define an inner product by setting
$\langle f, g\rangle = \sum_{n \in G} f(n)\overline{g(n)}$. Every element $n \in G$ defines a function
$h_n : k \mapsto \cos\left(\frac{2\pi nk}{p}\right) + i\sin\left(\frac{2\pi nk}{p}\right)$ which belongs to V. Given a function $f \in V$, define a function
$\hat{f} \in V$ as $\hat{f} : n \mapsto \langle f, h_n\rangle = \sum_{k \in G} f(k)h_n(-k)$. This function is called the
discrete Fourier transform of f of order p. One can show that the function $f \mapsto \hat{f}$
is in fact an automorphism of V. Moreover, $f(n) = \frac{1}{p}\hat{\hat{f}}(-n)$ and $\|f\| = \frac{1}{\sqrt{p}}\|\hat{f}\|$
for all $f \in V$ and all $n \in G$.
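
The inversion and norm identities for the discrete Fourier transform can be checked numerically. The sketch below is a direct (quadratic-time) evaluation of the defining sum, not a fast algorithm; the function names `h`, `dft`, and `norm` and the sample function `f` are assumptions introduced for this illustration.

```python
import cmath
import math

def h(n, k, p):
    """The function h_n evaluated at k: cos(2*pi*n*k/p) + i*sin(2*pi*n*k/p)."""
    return cmath.exp(2j * cmath.pi * n * k / p)

def dft(f):
    """Discrete Fourier transform: n -> sum over k of f(k) * h_n(-k)."""
    p = len(f)
    return [sum(f[k] * h(n, -k, p) for k in range(p)) for n in range(p)]

def norm(f):
    """Norm coming from the inner product <f, g> = sum of f(n) * conj(g(n))."""
    return math.sqrt(sum(abs(z) ** 2 for z in f))

# A sample element of C^G with p = 6 (note that 6 is not prime).
f = [1, 2j, -1, 0.5, 3, -2]
p = len(f)

f_hat = dft(f)
f_hat_hat = dft(f_hat)

# Inversion: f(n) = (1/p) * (transform of the transform)(-n), indices taken mod p.
recovered = [f_hat_hat[(-n) % p] / p for n in range(p)]

# Norm identity: ||f|| = ||f_hat|| / sqrt(p).
norm_gap = abs(norm(f) - norm(f_hat) / math.sqrt(p))
```

Both identities hold here only up to floating-point rounding, so any comparison should use a small tolerance.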


Example There are various generalizations of Proposition 15.2 which, as a rule, require
more sophisticated methods of complex analysis to prove. For example, the
contemporary Greek mathematicians Manolis Magiropoulos and Dimitri Karayannakis
have shown that if V is an inner product space and if u, v, and w are distinct
elements of V, then

$$2|\langle u, v\rangle| \cdot |\langle u, w\rangle| \le \langle u, u\rangle\bigl[\|v\| \cdot \|w\| + |\langle v, w\rangle|\bigr].$$

In case the set $\{v, w\}$ is linearly dependent, it is clear that this reduces to the inequality
in Proposition 15.2. Inequalities such as these allow us to get better bounds on
inner products. For example, let $0 < a < b$ be real numbers and let $V = C(a, b)$, on
which we have the inner product $\langle f, g\rangle = \int_a^b f(x)g(x)\,dx$. If $u, v, w \in V$ are given
by $u : x \mapsto 1/x$, $v : x \mapsto \sin(x)$, and $w : x \mapsto \cos(x)$ then Proposition 15.2 gives us
the bound


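$$|\langle u, v\rangle| \cdot |\langle u, w\rangle| \le \left(\int_a^b \frac{dx}{x^2}\right)\sqrt{\int_a^b \sin^2(x)\,dx}\,\sqrt{\int_a^b \cos^2(x)\,dx},$$

whereas this result gives us the better upper bound

$$\frac{1}{2}\left(\int_a^b \frac{dx}{x^2}\right)\left[\sqrt{\int_a^b \sin^2(x)\,dx}\,\sqrt{\int_a^b \cos^2(x)\,dx} + \left|\int_a^b \sin(x)\cos(x)\,dx\right|\right].$$


Proposition 15.4 Let V be an inner product space and let $\alpha \in \mathrm{End}(V)$ satisfy
the condition that there exists a real number $0 < c < 1$ such that $\|\alpha(v)\| \le c\|v\|$
for all $v \in V$. Then $\sigma_1 + \alpha$ is monic.
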
Proof If $0_V \ne v \in V$ then, by Proposition 15.3,

$$\|v\| = \|v + \alpha(v) - \alpha(v)\| \le \|v + \alpha(v)\| + \|\alpha(v)\| = \|(\sigma_1 + \alpha)(v)\| + \|\alpha(v)\|,$$

and so $\|(\sigma_1 + \alpha)(v)\| \ge \|v\| - \|\alpha(v)\| \ge (1 - c)\|v\| > 0$, which shows that
$v \notin \ker(\sigma_1 + \alpha)$. Thus $\sigma_1 + \alpha$ is monic.


In particular, if V is a finitely-generated inner product space and if $\alpha \in \mathrm{End}(V)$
satisfies the condition that there exists a real number $0 < c < 1$ such that $\|\alpha(v)\| \le c\|v\|$
for all $v \in V$, then $\sigma_1 + \alpha \in \mathrm{Aut}(V)$. Let $\beta = (\sigma_1 + \alpha)^{-1}$. If $0_V \ne v \in V$ then

$$\|v\| = \|(\sigma_1 + \alpha)\beta(v)\| = \|\beta(v) + \alpha\beta(v)\| \ge \|\beta(v)\| - \|\alpha\beta(v)\| \ge \|\beta(v)\| - c\|\beta(v)\| = (1 - c)\|\beta(v)\|.$$

Similarly, $\|v\| \le \|\beta(v)\| + \|\alpha\beta(v)\| \le \|\beta(v)\| + c\|\beta(v)\| = (1 + c)\|\beta(v)\|$ and so
$\frac{1}{1+c}\|v\| \le \|\beta(v)\| \le \frac{1}{1-c}\|v\|$ for all $v \in V$.

Sometimes, however, we need a bit more generality. If V is a vector space over
R or C then, in general, a function $v \mapsto \|v\|$ from V to R satisfying conditions (1)-(3) of
Proposition 15.3 is called a norm and a vector space on which a fixed norm is defined is
called a normed space or, in a functional-analysis context, a pre-Banach space. An
immediate question is whether every norm defined on a vector space comes from an
inner product. The answer is negative: if, for example, we define the norm $\|\cdot\|_1$ on
$\mathbb{C}^n$ by setting

$$\left\|\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}\right\|_1 = \sum_{i=1}^n |a_i|,$$

then this cannot come from an inner product
since the parallelogram law is not satisfied by this norm. In fact, satisfying the
parallelogram law is not only necessary but also sufficient for a norm to come from an
inner product in the following
sense: let V be a vector space over R or C on which we have a norm $\psi : V \to \mathbb{R}$
satisfying $\psi(v + w)^2 + \psi(v - w)^2 = 2[\psi(v)^2 + \psi(w)^2]$ for all $v, w \in V$, and write
$\lambda(v, w) = \frac{1}{4}[\psi(v + w)^2 - \psi(v - w)^2]$. Then it is possible to define an inner product
on V relative to which the norm of a vector v is precisely $\psi(v)$. In the case
the field of scalars is R, this inner product is defined by $\langle v, w\rangle = \lambda(v, w)$ and
otherwise this inner product is defined by $\langle v, w\rangle = \lambda(v, w) + i\lambda(v, iw)$.
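
A small numerical check makes the role of the parallelogram law concrete. The sketch below, in Python with real vectors only, measures by how much a norm fails the parallelogram law and evaluates the expression one quarter of the difference of the squared norms of the sum and the difference; the function names `norm1`, `norm2`, `parallelogram_gap`, and `polarization` are assumptions made for this illustration.

```python
def norm1(v):
    """The norm given by the sum of the absolute values of the entries."""
    return sum(abs(a) for a in v)

def norm2(v):
    """The norm coming from the dot product."""
    return sum(abs(a) ** 2 for a in v) ** 0.5

def parallelogram_gap(norm, v, w):
    """norm(v+w)^2 + norm(v-w)^2 - 2*(norm(v)^2 + norm(w)^2); zero iff the law holds for v, w."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * (norm(v) ** 2 + norm(w) ** 2)

def polarization(norm, v, w):
    """(1/4) * [norm(v+w)^2 - norm(v-w)^2], the candidate inner product over R."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return (norm(plus) ** 2 - norm(minus) ** 2) / 4

v, w = [1.0, 0.0], [0.0, 1.0]
gap1 = parallelogram_gap(norm1, v, w)   # nonzero: the law fails for this pair
gap2 = parallelogram_gap(norm2, v, w)   # zero up to rounding

# For the dot-product norm, polarization recovers the dot product 1*3 + 2*4.
inner = polarization(norm2, [1.0, 2.0], [3.0, 4.0])
```

A single pair of vectors with a nonzero gap is enough to show that a norm cannot come from any inner product.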


With kind permission of the Archives 
of the Mathematisches Forschungsin- 
stitut Oberwolfach (Wiener); © Ste- 
fan Banach (Banach). 

Normed spaces were first 
studied at the beginning of 
the twentieth century by the 
Austrian mathematician Hans 
Hahn, and then by the Ameri- 
can mathematician Norbert Wiener and the Polish mathematician Stefan Banach. 


Example Every vector space over R can be turned into a normed space in at least
one way. Indeed, let V be a vector space over R for which we fix a basis $\{v_i \mid i \in \Omega\}$.
Then the function $\psi : V \to \mathbb{R}$ defined by $\psi : \sum_{i \in \Omega} a_i v_i \mapsto \sum_{i \in \Omega} |a_i|$ can easily be
seen to be a norm on V.


Example Let $V = C(0, 1)$, which is a vector space over R, and for each positive
integer n, let $f_n \in V$ be the function defined by

$$f_n : x \mapsto \begin{cases} 1 - nx & \text{if } 0 \le x \le \frac{1}{n}, \\ 0 & \text{otherwise.} \end{cases}$$

Let $\|\cdot\|$ be the norm defined on V by the inner product $\langle f, g\rangle = \int_0^1 f(x)g(x)\,dx$
and let $\|\cdot\|_\infty$ be the norm on V defined by $\|f\|_\infty = \sup\{|f(x)| \mid 0 \le x \le 1\}$. Then
$\|f_n\| = \frac{1}{\sqrt{3n}}$ for all positive integers n, whereas $\|f_n\|_\infty = 1$ for all positive integers n.
Thus there can be no real number c satisfying $\|f\|_\infty \le c\|f\|$ for all $f \in V$.


Example Let V and W be normed spaces over the same field of scalars F (which is
either R or C). If $\alpha \in \mathrm{Hom}(V, W)$, set

$$\|\alpha\| = \sup\left\{\frac{\|\alpha(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V\right\},$$

where the norm in the numerator is the one defined on W and the norm in the denominator
is the one defined on V. (If V is trivial then the only such $\alpha$ is the 0-function,
the norm of which we set equal to 0.) Note that the fraction $\|\alpha(v)\|/\|v\|$ is just
$\|\alpha(v')\|$, where $v'$ is the normal vector $\frac{1}{\|v\|}v$, so we see that $\|\alpha\|$ is just $\sup\{\|\alpha(v')\|\}$,
where the supremum runs over all normal vectors $v'$ in V. In particular, if $\delta \in D(V)$
then we define the norm of $\delta$ to be

$$\|\delta\| = \sup\left\{\frac{|\delta(v)|}{\|v\|} \;\middle|\; 0_V \ne v \in V\right\}.$$


Note that $\|\alpha\|$ may not be finite, though it surely will be if $\alpha$ is bounded. For
example, let V be the space of all polynomial functions in $\mathbb{R}^{\mathbb{R}}$ on which we define
the norm $\|f\| = \max\{|f(t)| \mid 0 \le t \le 1\}$. Let $\alpha$ be the differentiation endomorphism
of V and, for each $h \ge 1$, let $f_h \in V$ be given by $f_h : x \mapsto x^h$. Then

$$\frac{\|\alpha(f_h)\|}{\|f_h\|} = h$$

for each $h \ge 1$, showing that $\|\alpha\|$ is infinite. If V is finitely generated, then we assert
that $\|\alpha\|$ is finite for all $\alpha \in \mathrm{Hom}(V, W)$, a claim which we will justify in the next
chapter.


We claim that, if $\|\alpha\|$ is finite for all $\alpha$, then this is a norm defined on Hom(V, W),
called the norm induced by the respective norms on V and W. Indeed, as an immediate
consequence of the definition we see that $\|\alpha\| \ge 0$ for all $\alpha \in \mathrm{Hom}(V, W)$, with
equality happening only when $\alpha$ is the 0-function. We also see that if $\alpha$ is not the
0-function then $\|\alpha\|$ is the smallest positive real number c such that $\|\alpha(v)\| \le c\|v\|$
for all $v \in V$. (We note a subtle point here: the norms on V and W are, of course,
different. Therefore, in the case V = W, and if we have two different norms defined
on V, we may use one in the numerator and another in the denominator, though
usually one uses the same norm in both instances.)
Now let $\alpha \in \mathrm{Hom}(V, W)$ and $a \in F$. Then

$$\|a\alpha\| = \sup\left\{\frac{\|a\alpha(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V\right\} = \sup\left\{\frac{|a| \cdot \|\alpha(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V\right\} = |a| \cdot \|\alpha\|.$$

Finally, if $\alpha, \beta \in \mathrm{Hom}(V, W)$ then

$$\|\alpha + \beta\| = \sup\left\{\frac{\|(\alpha + \beta)(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V\right\} = \sup\left\{\frac{\|\alpha(v) + \beta(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V\right\}$$

$$\le \sup\left\{\frac{\|\alpha(v)\| + \|\beta(v)\|}{\|v\|} \;\middle|\; 0_V \ne v \in V\right\} \le \|\alpha\| + \|\beta\|.$$


If $V = F^n$ and $W = F^k$, endowed with respective dot products and the norms defined
by them, then the induced norm on Hom(V, W) does, in fact, always exist and
is called the spectral norm. If $A \in M_{k \times n}(F)$, then the spectral norm of A is defined
to be the spectral norm of the homomorphism from $F^n$ to $F^k$ given by $v \mapsto Av$.

In 1941, Gelfand showed that if n is a positive integer and $A \in M_{n \times n}(\mathbb{C})$, then
the spectral radius of A satisfies $\rho(A) = \lim_{k \to \infty} \sqrt[k]{\|A^k\|}$, where $\|\cdot\|$ is any norm
defined on $M_{n \times n}(\mathbb{C})$. In other words, we see that, given $A \in M_{n \times n}(\mathbb{C})$, there exists
a sufficiently large k such that $\|A^k\|$ is approximately equal to $\rho(A)^k$.
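
Gelfand's formula is easy to observe numerically. The following Python sketch uses a hypothetical 2 x 2 triangular matrix (so that its eigenvalues, and hence its spectral radius, can be read off the diagonal) and the entrywise square-root-of-sum-of-squares norm; the names `matmul`, `frobenius`, and `estimates` are assumptions of this example.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def frobenius(A):
    """Square root of the sum of the squared absolute values of the entries."""
    return sum(abs(a) ** 2 for row in A for a in row) ** 0.5

# Upper triangular, so the eigenvalues are the diagonal entries 2 and 0.5.
A = [[2.0, 1.0],
     [0.0, 0.5]]
rho = 2.0

# estimates[k-1] is the k-th root of the norm of A^k, for k = 1, ..., 60.
estimates = []
power = A
for k in range(1, 61):
    if k > 1:
        power = matmul(power, A)
    estimates.append(frobenius(power) ** (1.0 / k))
```

The estimates approach the spectral radius only slowly (the error behaves roughly like a constant raised to the power 1/k), which is why the formula is mainly of theoretical interest.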


Example If p is any positive integer, we can define the Hölder norm $\|\cdot\|_p$ on $\mathbb{C}^n$ by
setting

$$\left\|\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}\right\|_p = \left(\sum_{i=1}^n |a_i|^p\right)^{1/p}.$$

For the case p = 2, this, of course, reduces to
the norm coming from the dot product. The proof that this is a norm in the general
case relies on a generalization of Minkowski’s inequality: $\|v + w\|_p \le \|v\|_p + \|w\|_p$
for all $v, w \in \mathbb{C}^n$ and any positive integer p. This norm can be used to define a norm
on $\mathrm{Hom}(\mathbb{C}^n, \mathbb{C}^k)$ for positive integers k and n, by setting

$$\|\alpha\|_p = \sup\left\{\frac{\|\alpha(v)\|_p}{\|v\|_p} \;\middle|\; 0_V \ne v \in V\right\}$$

for any $\alpha \in \mathrm{Hom}(\mathbb{C}^n, \mathbb{C}^k)$.
With kind permission of The City University of New
York (Bowker); With kind permission of UAL, FS N 191 
(Holder). 

General matrix norms were first discussed by the 
twentieth-century American mathematician Al- 
bert H. Bowker. The nineteenth-century German 
algebraist Otto H6lder was strongly influenced 
by the work of Kronecker. 


Example Let n be a positive integer. If $A = [a_{ij}] \in M_{n \times n}(\mathbb{R})$, set

$$\|A\|_C = \max\left\{\left|\sum_{i=1}^n \sum_{j=1}^n a_{ij} c_i c'_j\right| \;\middle|\; c_i, c'_j \in \{0, 1\}\right\}.$$

This defines a norm on $M_{n \times n}(\mathbb{R})$,
known as the cut norm. This norm has important applications in graph theory
and combinatorics, but is hard to calculate. However, efficient methods of approximating
the cut norm of a matrix exist, making use of the following remarkable
result, known as Grothendieck’s inequality: there exists a universal constant
$k_G$ (not dependent on n) satisfying the condition that any normal vectors
$u_1, \ldots, u_n, w_1, \ldots, w_n$ in $\mathbb{R}^n$ satisfy

$$\sum_{i=1}^n \sum_{j=1}^n a_{ij}(u_i \cdot w_j) \le k_G \max\left\{\sum_{i=1}^n \sum_{j=1}^n a_{ij}\varepsilon_i \varepsilon'_j \;\middle|\; \varepsilon_i, \varepsilon'_j \in \{-1, 1\}\right\}.$$

The precise value of the constant
$k_G$, known as Grothendieck’s constant, has not been determined, but the French
mathematician Jean-Louis Krivine has shown that $1.677\ldots \le k_G \le 1.782\ldots$.
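
To see why the cut norm is hard to calculate, it helps to write out the definition literally. The Python sketch below is a brute-force evaluation that tries every choice of 0/1 coefficients, so its running time grows like 4 to the power n; it is meant only as an illustration for very small matrices, and the function name `cut_norm` and the sample matrices are assumptions of this example.

```python
from itertools import product

def cut_norm(A):
    """Brute-force cut norm: maximize |sum of a_ij * c_i * d_j| over all 0/1 vectors c, d."""
    n = len(A)
    best = 0
    for c in product((0, 1), repeat=n):
        for d in product((0, 1), repeat=n):
            s = sum(A[i][j] * c[i] * d[j] for i in range(n) for j in range(n))
            best = max(best, abs(s))
    return best

# Hypothetical small examples.
first = cut_norm([[1, -1], [-1, 1]])
second = cut_norm([[1, 2], [3, -4]])
print(first, second)
```

Choosing the 0/1 vectors amounts to selecting a set of rows and a set of columns and summing the entries of the corresponding submatrix, which is where the connection to cuts in graphs comes from.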


With kind permission of the Archives of the Mathematisches Forschungsinstitut Ober- 
wolfach. 


The French algebraic geometer Alexandre Grothendieck is consid- 
ered one of the most influential of contemporary mathematicians. 


Example For positive integers k and n, we define the Frobenius norm or Hilbert-Schmidt
norm of $A = [a_{ij}] \in M_{k \times n}(\mathbb{C})$ by

$$\|A\|_F = \sqrt{\mathrm{tr}(AA^H)} = \sqrt{\sum_{i=1}^k \sum_{j=1}^n |a_{ij}|^2}.$$

This is precisely the norm coming from the inner product on $M_{k \times n}(\mathbb{C})$ given by
$\langle A, B\rangle = \mathrm{tr}(AB^H)$. If $A \in M_{k \times n}(\mathbb{C})$ has spectral norm $\|A\|$ and Frobenius norm
$\|A\|_F$, then it is straightforward to show that $\|A\| \le \|A\|_F \le \sqrt{n}\,\|A\|$.


For vector spaces V finitely generated over R or C, it does not matter which
norm one chooses. To see this, we need the following preliminary result.
Proposition 15.5 Let $\{v_1, \ldots, v_n\}$ be a finite linearly-independent subset of
a normed space V. Then there exists a positive real number c such that
$\left\|\sum_{i=1}^n a_i v_i\right\| \ge c\left(\sum_{i=1}^n |a_i|\right)$ for all scalars $a_1, \ldots, a_n$.


Proof Let W be the subspace of V generated by $\{v_1, \ldots, v_n\}$ and let Y be the subset
of W consisting of all linear combinations $\sum_{i=1}^n a_i v_i$ for which $\sum_{i=1}^n |a_i| = 1$. Pick
$w = \sum_{i=1}^n a_i v_i \in W$. If $w = 0_V$, then $a_i = 0$ for each i and so $\sum_{i=1}^n |a_i| = 0$. Therefore,
any positive real number c will do. Hence we can assume that $w \ne 0_V$ and so
$d = \sum_{i=1}^n |a_i| > 0$. Moreover, $y = \sum_{i=1}^n (a_i d^{-1})v_i \in Y$ and $\|w\| \ge c\left(\sum_{i=1}^n |a_i|\right)$ if
and only if $\|y\| \ge c$. Therefore, to prove the proposition it suffices to show that there
exists a positive real number c satisfying the condition that $\|y\| \ge c$ for all $y \in Y$.

Suppose that this is not the case. Then we can find a sequence $y_1, y_2, \ldots$ of vectors
in Y such that $y_h = \sum_{i=1}^n b_{ih}v_i$ with $\sum_{i=1}^n |b_{ih}| = 1$ and $\lim_{h \to \infty} \|y_h\| = 0$. In
particular, we note that $|b_{ih}| \le 1$ for each $1 \le i \le n$ and each $h \ge 1$. Thus, in particular,
the sequence $b_{11}, b_{12}, \ldots$ of scalars is bounded. By the Bolzano-Weierstrass
Theorem (which holds for both real and complex numbers), this sequence must
therefore have a convergent subsequence. Throwing away all of the $y_h$ for which $b_{1h}$ is
not in that subsequence, we can assume without loss of generality that the sequence
$b_{11}, b_{12}, \ldots$ converges to some scalar $b_1$. Similarly, the sequence $b_{21}, b_{22}, \ldots$ has a
convergent subsequence and, throwing away all of the $y_h$ for which $b_{2h}$ is not in that
subsequence, we can assume that the sequence $b_{21}, b_{22}, \ldots$ converges to some scalar $b_2$
as well. Continuing in this manner, we finally obtain an infinite sequence $y_1, y_2, \ldots$
of vectors in Y such that, for each $1 \le i \le n$, the sequence of scalars $b_{i1}, b_{i2}, \ldots$
converges to some scalar $b_i$.

Set $y = \sum_{i=1}^n b_i v_i$. Clearly, $\sum_{i=1}^n |b_i| = 1$, so $y \in Y$ and not all of the $b_i$ are equal to 0. In
particular, $y \ne 0_V$ and so $\|y\| = r > 0$. On the other hand, for each $h \ge 1$ we have
$\|y\| \le \|y - y_h\| + \|y_h\| = \left\|\sum_{i=1}^n (b_i - b_{ih})v_i\right\| + \|y_h\| \le \left(\sum_{i=1}^n |b_i - b_{ih}| \cdot \|v_i\|\right) + \|y_h\|$.
But $\lim_{h \to \infty} \|y_h\| = 0$ and $\lim_{h \to \infty} |b_i - b_{ih}| = 0$ for each $1 \le i \le n$, and so
there exists an integer h so large that $\|y\| < r$. This is a contradiction, from which
the result follows.


Norms $\|\cdot\|_a$ and $\|\cdot\|_b$ defined on the same vector space V are equivalent
if and only if there exist positive real numbers c and d such that $c\|v\|_a \le \|v\|_b \le d\|v\|_a$
for all $v \in V$.


Proposition 15.6 Any two norms defined on a finitely-generated vector space 
V over R or C are equivalent. 


Proof Let $\{v_1, \ldots, v_n\}$ be a basis for a vector space V over R or C on which we
have norms $\|\cdot\|_a$ and $\|\cdot\|_b$ defined. By Proposition 15.5, there exists a positive real number c such
that $\left\|\sum_{i=1}^n a_i v_i\right\|_b \ge c\left(\sum_{i=1}^n |a_i|\right)$ for any vector $v = \sum_{i=1}^n a_i v_i$ in V. On the other
hand, from the triangle inequality, we have $\|v\|_a \le \sum_{i=1}^n |a_i| \cdot \|v_i\|_a \le r\left(\sum_{i=1}^n |a_i|\right)$,
where $r = \max\{\|v_1\|_a, \ldots, \|v_n\|_a\} > 0$ and so $(cr^{-1})\|v\|_a \le \|v\|_b$ for each $v \in V$.

Interchanging the roles of $\|\cdot\|_a$ and $\|\cdot\|_b$, we repeat this proof to obtain a positive
real number d such that $\|v\|_b \le d\|v\|_a$ for each $v \in V$.


Proposition 15.7 (Hahn-Banach Theorem) Let V be a vector space over a
field F which is either R or C and let $v \mapsto \|v\|$ be a norm defined on V. Moreover,
let W be a subspace of V and let $\delta \in D(W)$ satisfy the condition that
$|\delta(w)| \le \|w\|$ for all $w \in W$. Then there exists a linear functional $\theta \in D(V)$
which is an extension of $\delta$ satisfying $|\theta(v)| \le \|v\|$ for all $v \in V$.


Proof (1) We first consider the case F = R. Let C be the set of all pairs $(Y, \psi)$,
where Y is a subspace of V containing W and $\psi \in D(Y)$ satisfies the conditions
that $\psi(y) \le \|y\|$ for all $y \in Y$ and $\psi$ is an extension of $\delta$. This set is nonempty since
$|\delta(w)| \le \|w\|$ surely implies that $\delta(w) \le \|w\|$ and so $(W, \delta) \in C$. Moreover, we can
define a partial order on C by setting $(Y, \psi) \le (Y', \psi')$ if and only if $Y \subseteq Y'$ and
$\psi'(y) = \psi(y)$ for all $y \in Y$. If $\{(Y_h, \psi_h) \mid h \in \Omega\}$ is a chain in C, set $Y = \bigcup_{h \in \Omega} Y_h$
and define $\psi \in D(Y)$ by setting $\psi(y) = \psi_h(y)$ when $y \in Y_h$. This function is well-defined
since the given family is a chain, and $(Y, \psi)$ surely belongs to C. Moreover, it is clear that
$(Y_h, \psi_h) \le (Y, \psi)$ for each $h \in \Omega$. Therefore, by the Hausdorff Maximum Principle,
C has a maximal element, which we will denote by $(Y_0, \theta)$.

We want to show that $Y_0 = V$. Indeed, assume that this is not the case and let
$z \in V \setminus Y_0$. Then $Y_1 = Y_0 + \mathbb{R}z$ properly contains $Y_0$ and, for any $c_0 \in \mathbb{R}$ we can
define the linear functional $\theta_1 \in D(Y_1)$ by $\theta_1 : y_0 + az \mapsto \theta(y_0) + ac_0$ which
surely is an extension of $\theta$. We will be done if we can pick $c_0$ in such a manner
that $\theta_1(y_1) \le \|y_1\|$ for each $y_1 \in Y_1$, for if we can do that, then we would have
$(Y_1, \theta_1) \in C$, contradicting the maximality of $(Y_0, \theta)$. If $y_1, y_2 \in Y_0$ then

$$\theta(y_1) - \theta(y_2) = \theta(y_1 - y_2) \le \|y_1 - y_2\| = \|y_1 + z - z - y_2\| \le \|y_1 + z\| + \|{-z} - y_2\|.$$

This implies that $-\|{-z} - y_2\| - \theta(y_2) \le \|y_1 + z\| - \theta(y_1)$. Since $y_2$ does not appear
on the right side of this inequality nor does $y_1$ appear on the left side, we see that the
real numbers

$$d_1 = \inf\{\|y_1 + z\| - \theta(y_1) \mid y_1 \in Y_0\}$$

and

$$d_2 = \sup\{-\|{-z} - y_2\| - \theta(y_2) \mid y_2 \in Y_0\}$$

satisfy $d_2 \le d_1$. Now choose $c_0$ to be any real number satisfying $d_2 \le c_0 \le d_1$.
We claim that $\theta(y_0) + ac_0 \le \|y_0 + az\|$ for all $y_0 \in Y_0$ and all real numbers a. If a = 0, we know
it is true by the choice of $\theta$. If a > 0 we have $c_0 \le d_1 \le \|a^{-1}y_0 + z\| - \theta(a^{-1}y_0)$
and so $ac_0 \le a\|a^{-1}y_0 + z\| - \theta(y_0) = \|y_0 + az\| - \theta(y_0)$, whence $\theta(y_0) + ac_0 \le \|y_0 + az\|$.
If a < 0 we have $c_0 \ge d_2 \ge -\|{-z} - a^{-1}y_0\| - \theta(a^{-1}y_0)$ and so
$-ac_0 \ge a\|{-z} - a^{-1}y_0\| + \theta(y_0) = -\|az + y_0\| + \theta(y_0)$ and so $-\theta(y_0) - ac_0 \ge -\|az + y_0\|$,
whence $\theta(y_0) + ac_0 \le \|y_0 + az\|$.

Thus we see that $Y_0 = V$, and so $\theta \in D(V)$ satisfies $\theta(v) \le \|v\|$ for all $v \in V$. If $v \in V$ then
$-\theta(v) = \theta(-v) \le \|{-v}\| = |{-1}| \cdot \|v\| = \|v\|$ as well and so $|\theta(v)| \le \|v\|$ for all
$v \in V$, proving our result in the case the field of scalars is R.

(2) Now assume that F = C. Since W and V are vector spaces over C, they are
also vector spaces over R. Write $\delta$ as $\delta : w \mapsto \delta_1(w) + i\delta_2(w)$, where $\delta_1, \delta_2 \in D(W)$,
considering W as a vector space over R. Moreover, $\delta_1(w) \le |\delta(w)|$ for all $w \in W$,
since $\mathrm{Re}(z) \le |z|$ for any $z \in \mathbb{C}$. Therefore, $\delta_1(w) \le \|w\|$ for all $w \in W$ and, as in
the last part of the proof of part (1), we actually have $|\delta_1(w)| \le \|w\|$ for all $w \in W$.
By part (1), we then know that there exists a linear functional $\theta_1 \in D(V)$, where V is
considered as a vector space over R, which extends $\delta_1$ and satisfies
$\theta_1(v) \le \|v\|$ for all $v \in V$.

But $i[\delta_1(w) + i\delta_2(w)] = i\delta(w) = \delta(iw) = \delta_1(iw) + i\delta_2(iw)$ for all $w \in W$. Since
the real parts of both sides must be equal, we see that $\delta_2(w) = -\delta_1(iw)$. Now define
the function $\theta : V \to \mathbb{C}$ by setting $\theta : v \mapsto \theta_1(v) - i\theta_1(iv)$. This is a linear functional
on V, considered as a vector space over C, since clearly $\theta(v + v') = \theta(v) + \theta(v')$
for all $v, v' \in V$ and for each $a + bi \in \mathbb{C}$ and $v \in V$ we have

$$\begin{aligned}
\theta((a + bi)v) &= \theta_1(av + ibv) - i\theta_1(iav - bv) \\
&= a\theta_1(v) + b\theta_1(iv) - i[a\theta_1(iv) - b\theta_1(v)] \\
&= (a + bi)[\theta_1(v) - i\theta_1(iv)] = (a + bi)\theta(v).
\end{aligned}$$

Furthermore, $\theta$ is an extension of $\delta$.

We claim that $|\theta(v)| \le \|v\|$ for all $v \in V$. To begin with, we note that if $\theta(v) = 0$
this holds, since $\|v\| \ge 0$ for all $v \in V$. Now assume that $|\theta(v)| > 0$. Then there exists
a real number r such that $\theta(v) = |\theta(v)|e^{ir}$ and so $|\theta(v)| = \theta(v)e^{-ir} = \theta(e^{-ir}v)$. Since
$|\theta(v)|$ is real, this means that $\theta(e^{-ir}v) \in \mathbb{R}$ and so $|\theta(v)| = \theta(e^{-ir}v) = \theta_1(e^{-ir}v) \le
\|e^{-ir}v\| = |e^{-ir}| \cdot \|v\| = \|v\|$. Thus the proposition is proven.


Proposition 15.8 Let V be a normed space and let W be a nontrivial subspace
of V on which we are given a linear functional $\delta$, for which $\|\delta\|$ is finite. Then
there exists a linear functional $\theta \in D(V)$ which is an extension of $\delta$ satisfying
$\|\theta\| = \|\delta\|$.


Proof For each $w \in W$ we have $|\delta(w)| \le \|\delta\| \cdot \|w\|$. Moreover, we have a norm
$v \mapsto \|v\|^*$ on V by setting $\|v\|^* = \|\delta\| \cdot \|v\|$ for all $v \in V$. Therefore, by Proposition 15.7,
we know that there exists a linear functional $\theta \in D(V)$ extending $\delta$ and
satisfying $|\theta(v)| \le \|v\|^* = \|\delta\| \cdot \|v\|$, and so $\frac{|\theta(v)|}{\|v\|} \le \|\delta\|$ for all $0_V \ne v \in V$. Thus
$\|\theta\| \le \|\delta\|$. On the other hand,

$$\|\delta\| = \sup\left\{\frac{|\delta(w)|}{\|w\|} \;\middle|\; 0_V \ne w \in W\right\} \le \sup\left\{\frac{|\theta(v)|}{\|v\|} \;\middle|\; 0_V \ne v \in V\right\} = \|\theta\|,$$

and so we have the desired equality.


The norm $\|\cdot\|_1$ defined on $\mathbb{C}^n$ is important in various contexts. Let n be a positive
integer and let $\theta$ be the function from $M_{n \times n}(\mathbb{C})$ to R defined by
$\theta : [a_{ij}] \mapsto \max\left\{\sum_{i=1}^n |a_{ij}| \;\middle|\; 1 \le j \le n\right\}$, which we have already seen when we defined condition
numbers. Numerical algorithms that compute the eigenvalues of a matrix, as
a rule, make roundoff errors on the order of $c\theta(A)$, where c is a constant determined
by the precision of the computer on which the algorithm is running. Since
the eigenvalues of similar matrices are identical, it is usually useful, given a square
matrix A, to find a matrix B similar to A with $\theta(B)$ small. This can often be
done by choosing B of the form $PAP^{-1}$, where P is a nonsingular diagonal matrix.


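Example If $A = \begin{bmatrix} 1 & 0 & 10^{-4} \\ 1 & 1 & 10^{-2} \\ 10^{4} & 10^{2} & 1 \end{bmatrix}$, then $\theta(A) = 10002$. However, if we choose
$P = \begin{bmatrix} 10^{2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 10^{-2} \end{bmatrix}$, then $\theta(PAP^{-1}) = 3$.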
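
The effect of a diagonal similarity on the maximal absolute column sum can be reproduced in a few lines. The sketch below is in Python; the matrix entries and the diagonal of P are assumptions, read from the example above as powers of ten, and the function names `theta` and `diagonal_similarity` are introduced only for this illustration.

```python
def theta(A):
    """Maximal absolute column sum of a square matrix given as a list of rows."""
    n = len(A)
    return max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))

def diagonal_similarity(p, A):
    """Entries of P A P^{-1} for P = diag(p): entry (i, j) becomes p_i * a_ij / p_j."""
    n = len(A)
    return [[p[i] * A[i][j] / p[j] for j in range(n)] for i in range(n)]

# Assumed reading of the example: entries are powers of ten.
A = [[1.0, 0.0, 1e-4],
     [1.0, 1.0, 1e-2],
     [1e4, 1e2, 1.0]]
p = [1e2, 1.0, 1e-2]

B = diagonal_similarity(p, A)
before, after = theta(A), theta(B)
```

Since P is diagonal, the similarity only rescales entries and never mixes them, so it costs almost nothing to apply before an eigenvalue computation.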


Let $\alpha$ be the endomorphism of $\mathbb{C}^n$ represented with respect to the canonical basis
by a matrix $A \in M_{n \times n}(\mathbb{C})$. Then for each $v \in \mathbb{C}^n$ we have $\theta(A)\|v\|_1 \ge \|\alpha(v)\|_1$.
In particular, if c is an eigenvalue of $\alpha$ associated with an eigenvector v then
$\theta(A)\|v\|_1 \ge \|\alpha(v)\|_1 = |c| \cdot \|v\|_1$ and so $\theta(A) \ge |c|$. Thus we see that $\theta(A) \ge \rho(A)$,
where $\rho(A)$ is the spectral radius of A. This bound is called the Gershgorin bound.
In fact, we can sharpen this result.


With kind permission of the Archives of the Mathematisches 
Forschungsinstitut Oberwolfach (Taussky-Todd). 

Semyon Aranovich Gershgorin was a twentieth 
century Russian mathematician. Gershgorin’s the- 
orem was published in a Russian journal in 1931 
and was generally ignored, until it was noticed and 
publicized by the Austrian-born American mathe- 
matician Olga Taussky-Todd, one of the most im- 
portant researchers in matrix theory, who worked on the development of numerical linear 
algebra methods for computers after World War II. 
Proposition 15.9 (Gershgorin’s Theorem) Let $\alpha$ be the endomorphism of
$\mathbb{C}^n$ represented with respect to the canonical basis by the matrix $A = [a_{ij}] \in M_{n \times n}(\mathbb{C})$
and, for each $1 \le i \le n$, let $r_i = \sum_{j \ne i} |a_{ij}|$. Let $K_i$ be the circle
in the complex plane with radius $r_i$ and center $a_{ii}$. Then $\mathrm{spec}(\alpha) \subseteq K = \bigcup_{i=1}^n K_i$.


Proof Let c be an eigenvalue of $\alpha$ and let $v = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$ be an eigenvector of $\alpha$ associated
with c. Let h be an index satisfying $|b_h| \ge |b_i|$ for all $1 \le i \le n$. Then
$b_h \ne 0$ and $Av = cv$ so $(c - a_{hh})b_h = \sum_{j \ne h} a_{hj}b_j$ and hence
$|c - a_{hh}||b_h| \le \sum_{j \ne h} |a_{hj}b_j| \le |b_h|r_h$. Thus $|c - a_{hh}| \le r_h$ and so $c \in K_h \subseteq K$, as desired.


Proposition 15.10 (Diagonal Dominance Theorem¹) Let n be a positive
integer and let $A = [a_{ij}] \in M_{n \times n}(\mathbb{C})$ satisfy the condition that
$|a_{ii}| > \sum_{j \ne i} |a_{ij}|$ for all $1 \le i \le n$. Then A is nonsingular.


Proof The stated condition says that 0 does not belong to any of the circles K; 
defined in Gershgorin’s Theorem and so it cannot be an eigenvalue of A. Hence A 
is nonsingular. 
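
Gershgorin's Theorem and the Diagonal Dominance Theorem can both be illustrated on a small matrix. The following Python sketch uses a hypothetical 2 x 2 complex matrix, for which the eigenvalues can be obtained from the quadratic formula; the names `gershgorin`, `eig2`, `in_union`, and `strictly_dominant` are assumptions of this example.

```python
import cmath

def gershgorin(A):
    """List of (center, radius) pairs: center a_ii, radius = sum of |a_ij| over j != i."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i)) for i in range(n)]

def eig2(A):
    """Eigenvalues of a 2 x 2 matrix via the quadratic formula for its characteristic polynomial."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + disc) / 2, (tr - disc) / 2)

def in_union(z, discs, tol=1e-12):
    """True if z lies in at least one of the closed discs."""
    return any(abs(z - center) <= radius + tol for center, radius in discs)

def strictly_dominant(A):
    """True if |a_ii| exceeds the corresponding radius for every row."""
    return all(abs(center) > radius for center, radius in gershgorin(A))

# Hypothetical example matrix.
A = [[4, 1],
     [2, -3j]]

discs = gershgorin(A)
eigenvalues = eig2(A)
determinant = A[0][0] * A[1][1] - A[0][1] * A[1][0]
```

For larger matrices the discs are just as cheap to compute, while the eigenvalues themselves require an iterative algorithm, which is what makes the bound useful in practice.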


Example Let $\alpha$ be the endomorphism of $\mathbb{C}^4$ represented with respect to the


3 12 0 
canonical basis by the matrix A = = 2 : = . Then spec(A) = 
0 O 3 5 


$\{15.32, 4.49, 1.59 \pm 2.35i\}$ (approximately). These numbers are found in the union K of the following
circles in the complex plane: the circle of radius 3 around the point (3, 0),
the circle of radius 6 around the point (15, 0), the circle of radius 4 around the point
(0, 0), and the circle of radius 3 around the point (5, 0). We furthermore note that
$\mathrm{spec}(A) = \mathrm{spec}(A^T)$ and so, by the same argument, we see that the eigenvalues of $\alpha$
lie in the union $K'$ of the following circles in the complex plane: the circle of radius
7 around the point (3, 0), the circle of radius 1 around the point (15, 0), the circle
of radius 5 around the point (0, 0), and the circle of radius 3 around the point (5, 0).


¹This theorem was proven by the French mathematicians L. Lévy and J. Desplanques at the end
of the nineteenth century. It was independently rediscovered by several other algebraists, including 
Hadamard, Minkowski, and Nekrasov. 


These circles and the location of the eigenvalues can be seen in the figure above.
Thus, the eigenvalues of $\alpha$ lie in $K \cap K'$.


Since any polynomial in C[X] is the characteristic polynomial of a matrix, we 
can use Gershgorin’s Theorem to get a bound on the location of the zeros of any 
polynomial. However, there are more sophisticated methods available to get much 
better bounds. 

We will not go into the many results explicating Gershgorin’s Theorem. One of 
these, for example, states that if the union of s of the disks in the complex plane 
defined by Gershgorin circles forms a connected domain which is isolated from the 
disks defined by the remaining circles, then this domain contains precisely s of the 
eigenvalues of the given matrix. There are also many generalizations of Gershgorin’s 
Theorem, the best-known of which is the following. 


Proposition 15.11 (Brauer’s Theorem) Let $\alpha$ be the endomorphism of $\mathbb{C}^n$
represented with respect to the canonical basis by the matrix $A = [a_{ij}] \in M_{n \times n}(\mathbb{C})$
and, for each $1 \le i \le n$, let $r_i = \sum_{j \ne i} |a_{ij}|$. For each $1 \le i \ne j \le n$,
let $K_{ij}$ be the Cassini oval $\{z \in \mathbb{C} \mid |z - a_{ii}||z - a_{jj}| \le r_i r_j\}$ in the
complex plane. Then $\mathrm{spec}(\alpha) \subseteq K = \bigcup_{i \ne j} K_{ij}$.


Proof Let c be an eigenvalue of $\alpha$ and let $v = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}$ be an eigenvector of $\alpha$ associated
with c. Let h and k be indices such that $|b_h| \ge |b_k| \ge |b_i|$ for all $i \ne h, k$.
We know that $b_h \ne 0$, and we can assume, as well, that $b_k \ne 0$ for otherwise
we would have $a_{hh} = c$, in which case surely $c \in K$. Since $Av = cv$, we have
$(c - a_{hh})b_h = \sum_{j \ne h} a_{hj}b_j$ so

$$|c - a_{hh}||b_h| = \left|\sum_{j \ne h} a_{hj}b_j\right| \le \sum_{j \ne h} |a_{hj}||b_j| \le \sum_{j \ne h} |a_{hj}||b_k| = r_h|b_k|.$$

In other words, $|c - a_{hh}| \le r_h|b_k||b_h|^{-1}$. In the same manner, we obtain $|c - a_{kk}| \le
r_k|b_h||b_k|^{-1}$ and so, multiplying these two results together, we see that $|c - a_{hh}||c - a_{kk}| \le r_h r_k$,
so $c \in K_{hk} \subseteq K$, as desired.


Note that Gershgorin’s Theorem involves n circles, whereas Brauer’s Theorem
involves $\binom{n}{2} = \frac{1}{2}n(n - 1)$ ovals.


With kind permission of the Archives of the Mathematisches 
Forschungsinstitut Oberwolfach (Brauer). 

The twentieth-century German mathematician Al- 
fred Brauer emigrated to the United States in 
1939; his research was primarily in matrix theory. 
Giovanni Domenico Cassini was a seventeenth- 
century Italian mathematician and astronomer. 


Example It is sometimes useful to consider norms on vector spaces V not over
subfields of C, namely functions $v \mapsto \|v\|$ from V to R satisfying conditions (1)-(3)
of Proposition 15.3. For example, let F be a finite field and let V = F” for some 
positive integer n. Define ||v|| to be the number of nonzero entries in v, for each 
v € V. This function is called the Hamming norm and is of extreme importance in 
algebraic coding theory, where one is interested in vector spaces over F in which 
every nonzero vector has a large Hamming norm. In an example at the beginning 
of Chap. 5, we showed a vector space of dimension 3 over GF(2), every nonzero 
element of which has Hamming norm equal to 4. 
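
The Hamming norm is simple to compute, and a space of the kind described can be checked exhaustively. In the Python sketch below, the generator matrix `G` is a hypothetical choice (its columns are the seven nonzero vectors of length 3 over GF(2)) and is not necessarily the example from Chap. 5; the function names are likewise assumptions of this illustration.

```python
from itertools import product

def hamming_norm(v):
    """Number of nonzero entries of v."""
    return sum(1 for a in v if a != 0)

def hamming_distance(v, w, q=2):
    """Hamming norm of v - w, with entries taken modulo q."""
    return hamming_norm([(a - b) % q for a, b in zip(v, w)])

# Hypothetical generator matrix over GF(2): its rows span a subspace of dimension 3.
G = [[1, 0, 0, 1, 1, 0, 1],
     [0, 1, 0, 1, 0, 1, 1],
     [0, 0, 1, 0, 1, 1, 1]]

def encode(message):
    """Linear combination of the rows of G with coefficients from message, mod 2."""
    return [sum(m * g for m, g in zip(message, column)) % 2 for column in zip(*G)]

# Hamming norms of all nonzero vectors in the subspace spanned by the rows of G.
weights = {hamming_norm(encode(m)) for m in product((0, 1), repeat=3) if any(m)}
print(weights)
```

Because the subspace is closed under subtraction, every pair of distinct vectors in it is at Hamming distance equal to one of these weights, which is what matters for error correction.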


With kind permission of the Special Collections & Archives, Dudley Knox Library, 
Naval Postgraduate School. 

Richard Hamming, a twentieth-century American mathematician 
and computer scientist, is best known for his development of the the- 
ory of error-detecting and error-correcting codes. 


If v and w are vectors in a space V over which we have a norm defined, then
the distance between v and w is defined as $d(v, w) = \|v - w\|$. If $v \in V$ and $\emptyset \ne U \subseteq V$,
we define the distance of v from U by $d(v, U) = \inf\{d(v, u) \mid u \in U\}$.
When $V = \mathbb{R}^n$ on which we have the dot product, this just gives us the ordinary
notion of Euclidean distance. The ability to define the notion of distance in such 
spaces is important, since it allows us to measure the degree of error in algorithmic 
computations by measuring the distance between a computed value and the value 
predicted by theory. It also allows us to define the notion of convergence. 

The following proposition shows that this abstract notion of distance indeed has 
the geometric properties that one would expect from a notion of distance. 


Proposition 15.12 Let V be a normed space and let $v, w, y \in V$. Then:
(1) $d(v, w) = d(w, v)$;

(2) $d(v, w) \ge 0$, where equality exists if and only if $v = w$;

(3) (Triangle inequality) $d(v, w) \le d(v, y) + d(y, w)$.


Proof This is an immediate consequence of Proposition 15.3. 


Example Let A be a finite set, and let V be the collection of all subsets of A, which 
is a vector space over F = GF(2). We have a norm defined on V by letting || B|| be 
the number of elements in B. Then the distance between subsets B and C of A is 
||B + C||, namely the number of elements in their symmetric difference. 


If A and B are nonempty subsets of a space V over which we have a norm
defined, then we set $d(A, B) = \inf\{d(v, w) \mid v \in A \text{ and } w \in B\}$. In particular, if
$v \in V$ and B is a nonempty subset of V, we set $d(v, B) = d(\{v\}, B)$.

Let n be a positive integer. If $A = [a_{ij}] \in M_{n \times n}(\mathbb{C})$, and if k > 0 is an integer,
let us define the matrix $P(k) = [p_{ij}(k)]$ to be $I + \sum_{h=1}^k \frac{1}{h!}A^h$. We claim
that, for each fixed $1 \le i, j \le n$, the limit $\lim_{k \to \infty} p_{ij}(k)$ exists in C. Indeed, if
$B = [b_{ij}] \in M_{n \times n}(\mathbb{C})$, set $m(B) = \max_{1 \le i, j \le n} |b_{ij}|$. Then every entry in the matrix
$A^2$ equals the sum of n products of pairs of entries of A and so, in absolute
value, is equal to at most $m(A)^2 n$. Thus we see that $m(A^2) \le m(A)^2 n$. Similarly,
$m(A^3) \le m(A^2)m(A)n \le m(A)^3 n^2$ and so forth. Thus, in general,

$$m\left(\frac{1}{h!}A^h\right) \le \frac{1}{h!}m(A)^h n^{h-1} \le \frac{1}{h!}[n\,m(A)]^h$$

and so, in particular, $m(P(k)) \le \sum_{h=0}^k \frac{1}{h!}[n\,m(A)]^h$ for all $k \ge 1$. But from calculus we
know that the series $\sum_{h=0}^\infty \frac{1}{h!}r^h$ converges absolutely to $e^r$ for each real number r.
Therefore, the limit we seek exists, and, at least by analogy, we are justified in
denoting the matrix $[\lim_{k \to \infty} p_{ij}(k)]$ by $e^A$.
Matrix exponentials were explicitly studied by the American mathematician
William Henry Metzler at the end of the nineteenth century. They
appear earlier in the work of Laguerre and Peano.


Proposition 15.13 If n is a positive integer and $A = [a_{ij}] \in M_{n \times n}(F)$ is a diagonal
matrix, where F is R or C, then $e^A = [b_{ij}]$ is a diagonal matrix with
$b_{ii} = e^{a_{ii}}$ for all $1 \le i \le n$.


Proof This is an immediate consequence of the definition. 


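In particular, $e^O = I$. Moreover, this implies that if $B \in M_{n \times n}(F)$ is similar to
a diagonal matrix then B and $e^B$ have the same eigenvectors, while the eigenvalues
of $e^B$ are the exponentials of the eigenvalues of B.

Actually, we can do a bit better: if

$$A = \begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_m \end{bmatrix},$$

where each $A_i$ is a square matrix, then

$$e^A = \begin{bmatrix} e^{A_1} & O & \cdots & O \\ O & e^{A_2} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & e^{A_m} \end{bmatrix}.$$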


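Example If $O \ne A \in M_{n \times n}(F)$, where F is either R or C, is a nilpotent matrix
with index of nilpotence k, then $e^A = I + \sum_{h=1}^{k-1} \frac{1}{h!}A^h$. Thus, for example, if

$$A = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix}, \quad \text{we have} \quad e^A = I + A + \frac{1}{2}A^2 = \begin{bmatrix} 1 & 1 & \frac{3}{2} \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}.$$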
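
For a nilpotent matrix the partial sums that define the exponential stabilize after finitely many terms, so the exponential can be computed exactly. The Python sketch below evaluates the partial sums with rational arithmetic for a 3 x 3 strictly upper triangular matrix; the function names `matmul` and `exp_partial` are assumptions of this illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def exp_partial(A, k):
    """The partial sum I + A/1! + A^2/2! + ... + A^k/k!, computed with exact fractions."""
    n = len(A)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    term = identity
    for h in range(1, k + 1):
        # term becomes A^h / h! by multiplying the previous term by A and dividing by h.
        term = [[entry / h for entry in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

# Strictly upper triangular, hence nilpotent with A^3 = O.
A = [[0, 1, 2],
     [0, 0, -1],
     [0, 0, 0]]

E = exp_partial(A, 2)          # I + A + A^2 / 2
E_more = exp_partial(A, 10)    # further terms vanish because A^3 = O
```

For matrices that are not nilpotent the same loop gives only an approximation, and in floating-point arithmetic it can lose accuracy badly when the entries are large.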

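Example If $A = \begin{bmatrix} 0 & k \\ 0 & 0 \end{bmatrix} \in M_{2 \times 2}(\mathbb{R})$, then $e^A = \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$; if $A = \begin{bmatrix} 0 & r \\ -r & 0 \end{bmatrix}$ then

$$e^A = \begin{bmatrix} \cos(r) & \sin(r) \\ -\sin(r) & \cos(r) \end{bmatrix}.$$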
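Example If $A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$, then

$$e^A = \begin{bmatrix} \frac{2}{3} + \frac{1}{3}e^3 & \frac{1}{3}e^3 - \frac{1}{3} & \frac{1}{3}e^3 - \frac{1}{3} \\ \frac{1}{3}e^3 - \frac{1}{3} & \frac{2}{3} + \frac{1}{3}e^3 & \frac{1}{3}e^3 - \frac{1}{3} \\ \frac{1}{3}e^3 - \frac{1}{3} & \frac{1}{3}e^3 - \frac{1}{3} & \frac{2}{3} + \frac{1}{3}e^3 \end{bmatrix}.$$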


If $P \in M_{n \times n}(F)$ is nonsingular, where F is R or C, then

$$P^{-1}\left(I + \sum_{h=1}^k \frac{1}{h!}A^h\right)P = I + \sum_{h=1}^k \frac{1}{h!}(P^{-1}AP)^h$$

for each k and so $P^{-1}e^A P = e^{P^{-1}AP}$. Thus we see that the exponentials of similar
matrices are themselves similar. This is very important in calculations. In particular,
if A is diagonalizable there exists a nonsingular matrix P such that $P^{-1}AP$ is a
diagonal matrix $D = [d_{ij}]$ and so $P^{-1}e^A P = e^D$ is also a diagonal matrix. Thus $e^A$
is diagonalizable whenever A is.

If A, B is a commuting pair of matrices in $M_{n \times n}(F)$ then as a direct consequence
of the definition we see that $e^A e^B = e^{A+B} = e^B e^A$. But this is not true in
general, as the following example shows.


Example ra=|) 5 | ana =| o then 


ap_{l 1]fe! o]_ fet 1 e+ 1-e1!] arp 
PON ilo: sel ael a en 


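Example The condition that A, B be a commuting pair is sufficient for $e^A e^B = e^{A+B}$
to hold, but is not necessary. Thus, for example, if

$$A = \begin{bmatrix} 0 & \pi \\ -\pi & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 & (7 + 4\sqrt{3})\pi \\ (-7 + 4\sqrt{3})\pi & 0 \end{bmatrix}$$

then $AB \ne BA$, but $e^A = e^B = -I$ so $e^A e^B = I = e^{A+B}$.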


This fact is significant when it comes to calculating $e^A$ in many cases. For
example, suppose that $A$ is an $n \times n$ matrix having a single eigenvalue $c$ of
multiplicity $n$. Then for each scalar $t$, the matrices $ctI$ and $t(A - cI)$ commute and
so $e^{tA} = e^{ctI} e^{t(A - cI)} = (e^{ct}I) \sum_{k=0}^{\infty} \frac{t^k}{k!}(A - cI)^k$ and, from the Cayley-Hamilton
Theorem, we know that $(A - cI)^k = O$ for all $k \ge n$. Thus we see that
$e^{tA} = (e^{ct}I) \sum_{k=0}^{n-1} \frac{t^k}{k!}(A - cI)^k$ and so there exists a polynomial $p(X) \in F[X]$
satisfying $e^{tA} = p(A)$.
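
The finite sum above is straightforward to evaluate. The following is a minimal sketch in Python, assuming the caller already knows that $c$ is the only eigenvalue of $A$; the names `mat_mul` and `exp_single_eigenvalue` are illustrative and do not come from the text.

```python
from math import exp, factorial


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exp_single_eigenvalue(A, c, t):
    """Return e^{tA}, assuming c is the only eigenvalue of the n x n matrix A."""
    n = len(A)
    # N = A - cI satisfies N^k = O for all k >= n (Cayley-Hamilton).
    N = [[A[i][j] - (c if i == j else 0.0) for j in range(n)] for i in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # N^0
    total = [[0.0] * n for _ in range(n)]
    for k in range(n):  # only the terms k = 0, ..., n-1 survive
        coeff = t ** k / factorial(k)
        for i in range(n):
            for j in range(n):
                total[i][j] += coeff * power[i][j]
        power = mat_mul(power, N)
    scale = exp(c * t)
    return [[scale * x for x in row] for row in total]


# A has the single eigenvalue 2 with multiplicity 2.
A = [[2.0, 1.0], [0.0, 2.0]]
E = exp_single_eigenvalue(A, 2.0, 0.5)
```

For this $A$ and $t = 1/2$ the formula gives $e^{tA} = e\,(I + \tfrac{1}{2}(A - 2I))$, so the sketch can be checked by hand.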
Thus we can put much of what we said together. Given a matrix $A \in M_{n \times n}(\mathbb{C})$,
we know by Proposition 13.7 that it is similar to a matrix in Jordan canonical form.
That is to say, there exists a nonsingular matrix $P$ such that $P^{-1}AP$ is of the form
$$\begin{bmatrix} A_1 & O & \cdots & O \\ O & A_2 & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A_m \end{bmatrix},$$
where each $A_i$ is a square matrix of the form $c_i I + N_i$, for
$N_i$ a nilpotent matrix of a particularly simple form. Thus, for each $i$, we have
$e^{A_i} = e^{c_i} e^{N_i}$ and so
$$e^A = P \begin{bmatrix} e^{c_1}e^{N_1} & O & \cdots & O \\ O & e^{c_2}e^{N_2} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & e^{c_m}e^{N_m} \end{bmatrix} P^{-1}.$$
Moreover, each $e^{N_i}$ is just $p_i(N_i)$ for some polynomial $p_i(X) \in \mathbb{C}[X]$.
We also note that any matrix $A$ commutes with $-A$, so $e^A e^{-A} = e^{A - A} = e^O = I$,
proving that $e^A$ is nonsingular and $e^{-A} = (e^A)^{-1}$. Therefore, we have a
function $A \mapsto e^A$ from $M_{n \times n}(F)$ (where $F$ is either $\mathbb{R}$ or $\mathbb{C}$) to the set of all
nonsingular matrices in $M_{n \times n}(F)$, which is not monic. In the case $F = \mathbb{C}$, this function
is in fact epic. If $A \in M_{n \times n}(F)$ then a matrix $B \in M_{n \times n}(F)$ is a matrix logarithm
of $A$ if and only if $A = e^B$. From the previous discussion, we see that only
nonsingular matrices have logarithms. If $F = \mathbb{C}$, then every nonsingular matrix has a
logarithm, but not necessarily a unique one.


Example If $A = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & -2\pi \\ 2\pi & 0 \end{bmatrix}$ then $e^A = e^B = I$. Therefore, both
$A$ and $B$ are logarithms of $I$, which are not even similar.


A similar proof can be used to show that if $A$ has distinct eigenvalues $\{c_1, \ldots, c_n\}$
and if $p_k(X) = \prod_{j \ne k} (c_k - c_j)^{-1}(X - c_j)$ for all $1 \le k \le n$ then for any scalar $t$
we have $e^{tA} = \sum_{k=1}^{n} e^{tc_k} p_k(A)$.

What about, say, cos(A) and sin(A)? We know that the cosine function has a 
Maclaurin representation 


$$\cos(x) = \sum_{i=0}^{\infty} \frac{(-1)^i}{(2i)!} x^{2i}.$$


For each natural number n, let us consider the polynomial 


$$p_n(X) = \sum_{i=0}^{n} \frac{(-1)^i}{(2i)!} X^{2i}.$$


Then we can surely calculate p,(A) for each n and see whether the sequence of 
such matrices converges in some sense. However, there is another possibility. We 
know that for any real or complex number $z$ we have $\cos(z) = \frac{1}{2}[e^{iz} + e^{-iz}]$ and so
we can just define $\cos(A)$ to be the matrix $\frac{1}{2}[e^{iA} + e^{-iA}]$, which we know always
exists.


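Example We see that $\cos(I) = \begin{bmatrix} \cos(1) & 0 & 0 \\ 0 & \cos(1) & 0 \\ 0 & 0 & \cos(1) \end{bmatrix}$ and
$$\cos \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} \tfrac{2}{3} + \tfrac{1}{3}\cos(3) & \tfrac{1}{3}\cos(3) - \tfrac{1}{3} & \tfrac{1}{3}\cos(3) - \tfrac{1}{3} \\ \tfrac{1}{3}\cos(3) - \tfrac{1}{3} & \tfrac{2}{3} + \tfrac{1}{3}\cos(3) & \tfrac{1}{3}\cos(3) - \tfrac{1}{3} \\ \tfrac{1}{3}\cos(3) - \tfrac{1}{3} & \tfrac{1}{3}\cos(3) - \tfrac{1}{3} & \tfrac{2}{3} + \tfrac{1}{3}\cos(3) \end{bmatrix}.$$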


Similarly, we know that $\sin(z) = \frac{1}{2i}[e^{iz} - e^{-iz}]$ and so we can define $\sin(A)$ to
be $\frac{1}{2i}[e^{iA} - e^{-iA}]$.
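
The first possibility mentioned above, evaluating the polynomials $p_n(A)$ and watching them converge, can be tried numerically. The sketch below is only an illustration in Python with hypothetical helper names; it uses the $3 \times 3$ matrix $J$ all of whose entries equal 1, for which $J^2 = 3J$ and hence the limit is $I + \frac{1}{3}(\cos(3) - 1)J$.

```python
from math import cos, factorial


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cos_partial_sum(A, n):
    """Evaluate p_n(A) = sum_{i=0}^{n} (-1)^i A^{2i} / (2i)!."""
    size = len(A)
    A2 = mat_mul(A, A)
    term = [[1.0 if r == c else 0.0 for c in range(size)] for r in range(size)]  # A^0
    total = [[0.0] * size for _ in range(size)]
    for i in range(n + 1):
        coeff = (-1) ** i / factorial(2 * i)
        for r in range(size):
            for c in range(size):
                total[r][c] += coeff * term[r][c]
        term = mat_mul(term, A2)  # next even power of A
    return total


J = [[1.0] * 3 for _ in range(3)]
C = cos_partial_sum(J, 20)
diagonal_limit = 2.0 / 3.0 + cos(3.0) / 3.0
off_diagonal_limit = (cos(3.0) - 1.0) / 3.0
```

With twenty terms the entries of `C` agree with `diagonal_limit` and `off_diagonal_limit` to within floating-point roundoff.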


Exercises 


Exercise 934 
Let $V = C(-1, 1)$ and let $a > -\frac{1}{2}$ be a real number. Is the function
$\mu: V \times V \to \mathbb{R}$ defined by $\mu: (f, g) \mapsto \int_{-1}^{1} [1 - t^2]^{a - 1/2} f(t) g(t)\,dt$ an inner product
on $V$?


Exercise 935 
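Is the function $\mu: \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ defined by
$$\mu: \left(\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}\right) \mapsto a_1(b_1 + b_2) + a_2(b_1 + 2b_2)$$
an inner product on $\mathbb{R}^2$?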


Exercise 936 
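Is the function $\mu: \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ defined by
$$\mu: \left(\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}\right) \mapsto a_1 b_1 - a_1 b_2 - a_2 b_1 + 4 a_2 b_2$$
an inner product on $\mathbb{R}^2$?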


358 15 Inner Product Spaces 


Exercise 937 
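Is the function $\mu: \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ defined by
$$\mu: \left(\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}\right) \mapsto a_1 b_1 + 2 a_2 b_2 + 3 a_3 b_3 + a_1 b_2 + a_2 b_1$$
an inner product on $\mathbb{R}^3$?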


Exercise 938 
Verify whether the function $\mu: \mathbb{R}[X] \times \mathbb{R}[X] \to \mathbb{R}$ defined by $\mu: (f, g) \mapsto \deg(fg)$
is an inner product on $\mathbb{R}[X]$.


Exercise 939 
Give an example of a function yz : R* x R* > R which satisfies the first two 
conditions of an inner product, which does not satisfy the third, but does satisfy 


«([o} Lo) =" 


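Exercise 940
Is the function $\mu: \mathbb{R}[X] \times \mathbb{R}[X] \to \mathbb{R}$ defined by
$$\mu: \left(\sum_{i=0}^{\infty} a_i X^i, \sum_{j=0}^{\infty} b_j X^j\right) \mapsto \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{1}{i + j + 1} a_i b_j$$
an inner product on $\mathbb{R}[X]$?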


Exercise 941 

Let V be the vector space of all continuous functions from R to itself. Let 
uu: V x V > Rbe the function given by ww: (f, g) + limysoo tf, St (s)g(s) ds. 
Is jz an inner product? 


Exercise 942 

Let $V$ be a vector space over $\mathbb{C}$ and let $\mu: V \times V \to \mathbb{C}$ be a function satisfying
the following conditions:

(1) For each $w \in V$, the function $v \mapsto \mu(v, w)$ from $V$ to $\mathbb{C}$ is a linear functional;
(2) If $v, w \in V$ then $\mu(v, w) = \overline{\mu(w, v)}$;
(3) If $v \in V$ satisfies $\mu(v, w) = 0$ for all $w \in V$, then $v = 0_V$.

Is $\mu$ an inner product on $V$?


Exercise 943
Let $V$ be the vector space of all continuously differentiable functions from the
interval $[a, b]$ in $\mathbb{R}$ to $\mathbb{R}$. If $f, g \in V$, define the Sobolev inner product
$$\langle f, g \rangle = \int_a^b f(t) g(t)\,dt + \int_a^b f'(t) g'(t)\,dt,$$
where $f'$ and $g'$ are the derivatives of $f$ and $g$, respectively. Verify that this is
indeed an inner product on $V$.


With kind permission of the Archives of the Mathematisches Forschungsinstitut Ober- 
wolfach. 


Sergei Lvovich Sobolev was a twentieth-century Russian mathematician 
who worked primarily in functional analysis. 


Exercise 944 
Let $\{u, v\}$ be a linearly-dependent subset of an inner product space $V$. Show that
$\|u\|^2 v = \langle v, u \rangle u$.


Exercise 945 


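Let $n$ be a positive integer and let $v = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \in \mathbb{R}^n$. Is the function
$\mu_v: \mathbb{R}[X_1, \ldots, X_n] \times \mathbb{R}[X_1, \ldots, X_n] \to \mathbb{R}$ defined by
$$\mu_v: (p, q) \mapsto p(c_1, \ldots, c_n)\, q(c_1, \ldots, c_n)$$
an inner product on $\mathbb{R}[X_1, \ldots, X_n]$?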


Exercise 946 

Let $a < b$ be real numbers and let $V = C(a, b)$. Let $h_0 \in V$ be a function
satisfying the condition that $h_0(t) > 0$ for all $a < t < b$. Show that the function
$\mu: V \times V \to \mathbb{R}$ defined by $\mu: (f, g) \mapsto \int_a^b f(x) g(x) h_0(x)\,dx$ is an inner
product on $V$.


Exercise 947 
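Let $c$ and $d$ be given real numbers. Find a necessary and sufficient condition that
the function $\mu: \left(\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}\right) \mapsto c a_1 b_1 + d a_2 b_2$ be an inner product on $\mathbb{R}^2$.


Exercise 948
Is the function $\mu: \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ defined by
$$\mu: \left(\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}\right) \mapsto a_1^2 b_1 + b_2^2 a_2 + (a_3 b_3)^2$$
an inner product on $\mathbb{R}^3$?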


Exercise 949 
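Let $V$ be an inner product space over $\mathbb{R}$ and let $n > 1$ be an integer. For positive
real numbers $a_1, \ldots, a_n$, define the function $\mu: V^n \times V^n \to \mathbb{R}$ by
$$\mu: \left(\begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}, \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}\right) \mapsto \sum_{i=1}^{n} a_i \langle v_i, w_i \rangle.$$
Is $\mu$ an inner product on $V^n$?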


Exercise 950 

Let n be a positive integer and let V be the subspace of R[X] consisting of all 
polynomials of degree at most n. Is the function uw: V x V > R defined by 
bi(p.qgre Yo PCG (4) an inner product on V? 


Exercise 951 
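Let $0 < n \in \mathbb{Z}$. Is the function $\mu: \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ defined by
$$\mu: \left(\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}, \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}\right) \mapsto \sum_{i=1}^{n} a_i b_{n-i+1}$$
an inner product?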


Exercise 952 

Let V = C? on which we have defined the dot product, and let D = {v € V | 
: 1 

|u|] = 1}. Find {(Av, v) | v € D}, where A = E °| € Max2(C). 

Exercise 953 

Let $n$ be a positive integer and let $\{v_1, \ldots, v_k\}$ be a set of vectors in $\mathbb{R}^n$ satisfying
$v_i \cdot v_j < 0$ for all $1 \le i < j \le k$. Show that $k \le 2n$ and give an example in which
equality holds.


Exercise 954 
Let n be a positive integer and let A € Mynxn(C) be idempotent. Is A” necessarily 
idempotent?


Exercise 955
Let $V$ be an inner product space and let $v \ne v'$ be vectors in $V$. Show that there
exists a vector $w \in V$ satisfying $\langle v, w \rangle \ne \langle v', w \rangle$.


Exercise 956 

Let $V$ be an inner product space finitely generated over its field of scalars, and
let $B = \{v_1, \ldots, v_n\}$ be a basis of $V$. Show that there exists a basis $\{w_1, \ldots, w_n\}$
of $V$ satisfying the condition that
$$\langle v_i, w_j \rangle = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}$$

Exercise 957 
Let $W$ be a subspace of a vector space $V$ over $\mathbb{R}$ and let $Y$ be a complement of
$W$ in $V$. Define an inner product $\mu$ on $W$ and an inner product $\nu$ on $Y$. Is the
function from $V \times V \to \mathbb{R}$ defined by $(w + y, w' + y') \mapsto \mu(w, w') + \nu(y, y')$
an inner product on $V$?


Exercise 958 
Let $V = C(0, 1)$. Let $A = \{f_1, \ldots, f_n\}$ be a linearly-independent subset of $V$ and
define a function $u: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ by $u: (a, b) \mapsto \sum_{j=1}^{n} f_j(a) \cos^j(b)$. Show that
if $h \in V$ and if there exists a function $g \in V$ such that $h(x) = \int_0^1 u(x, y) g(y)\,dy$
for all $x \in \mathbb{R}$, then $h \in \mathbb{R}A$.


Exercise 959 

Let V be an inner product space over R. For each real number a, set 
U(a) = {v € V | (v, v) < a}. Given a real number a, find a real number b such 
that (v + w,v+w) € U(b) forall v, w € U(a). 


Exercise 960 
Let $V$ be an inner product space and let $\alpha \in \mathrm{End}(V)$. Show that
$\langle \alpha(v), v \rangle \langle v, \alpha(v) \rangle \le \|\alpha(v)\|^2$ for every normal vector $v \in V$.


Exercise 961 
For real numbers aj, ..., d,, show that 


n n n 
$04([Sum)( Ean) 
i=1 il 


i=1 


Exercise 962 
(Binet-Cauchy identity) For $u, v, w, y \in \mathbb{R}^3$, show that
$(v \times w) \cdot (y \times u) = (v \cdot y)(w \cdot u) - (v \cdot u)(y \cdot w)$.


Exercise 963
Let $v$ be a normal vector in $\mathbb{R}^3$. Show that the function $\alpha_v: \mathbb{R}^3 \to \mathbb{R}^3$ defined by
$\alpha_v: w \mapsto v \times (v \times w) + w$ is a projection in $\mathrm{End}(\mathbb{R}^3)$.


Exercise 964 
Let $V$ be an inner product space and let $\alpha \in \mathrm{End}(V)$ be a projection. Does it
necessarily follow that $\|\alpha(v)\| \le \|v\|$ for all $v \in V$?


Exercise 965 


For $u, v, w, y \in \mathbb{R}^3$, show that
$$(u \times v) \cdot (w \times y) = \begin{vmatrix} u \cdot w & u \cdot y \\ v \cdot w & v \cdot y \end{vmatrix}.$$


Exercise 966 


For nonnegative real numbers a, b, and c, show that 


$$(a + b + c)\sqrt{2} \le \sqrt{a^2 + b^2} + \sqrt{b^2 + c^2} + \sqrt{a^2 + c^2}.$$


Exercise 967 
For real numbers $0 < a < b < c$, show that
$$\sqrt{b^2 + c^2} \le (\sqrt{2})a + \sqrt{(b - a)^2 + (c - a)^2}.$$


Exercise 968 
Let n be a positive integer and let A € M,,,.,(R) be a matrix the n Gershgorin 
circles of which are mutually disjoint. Prove that all of the eigenvalues of A are 


real. 


Exercise 969 
Show that $\left[\int_0^1 f(x)\,dx\right]^2 \le \int_0^1 f(x)^2\,dx$ for any $f \in C(0, 1)$.


Exercise 970 
Let f :R—R be the constant function x + 1. Calculate || f|| when f is consid- 
ered as an element of C(O, 5) and compare it to || f||, when f is considered as 


an element of C(0, 77). 


Exercise 971 
Let $V$ be an inner product space over $\mathbb{R}$ and let $v, w \in V$ satisfy $\|v + w\| =
\|v\| + \|w\|$. Show that $\|av + bw\| = a\|v\| + b\|w\|$ for all $0 \le a, b \in \mathbb{R}$.


Exercise 972

(Real polarization identity) Let $V$ be an inner product space over $\mathbb{R}$. Show that
$\langle u, v \rangle = \frac{1}{4}\left(\|u + v\|^2 - \|u - v\|^2\right)$ for all $u, v \in V$.

Exercise 973 

(Complex polarization identity) Let $V$ be an inner product space over $\mathbb{C}$. Show
that $\langle u, v \rangle = \frac{1}{4}\left(\|u + v\|^2 - \|u - v\|^2 + i\|u + iv\|^2 - i\|u - iv\|^2\right)$ for all $u, v \in V$.


Exercise 974 

Let V = C(0,1) on which we have defined the inner product (f,g) = 
i i @)g(t) dt. Let W be the subspace of V generated by the function x +> x”. 
Find all elements of W normal with respect to this inner product. 


Exercise 975 
Let $V$ be an inner product space over $\mathbb{R}$ and assume that $v, w \in V$ are nonzero
vectors satisfying the condition $\langle v, w \rangle = \|v\| \cdot \|w\|$. Show that $\mathbb{R}v = \mathbb{R}w$.


Exercise 976 

Let V be a vector space over R on which we have two inner products, jz and pu’ 
defined, which in turn define distance functions d and d’ respectively. If d =d’, 
does it necessarily follow that w= yu’? 


Exercise 977 
(Apollonius' identity) Let $V$ be an inner product space. Show that
$$\|u - w\|^2 + \|v - w\|^2 = \frac{1}{2}\|u - v\|^2 + 2\left\|\frac{1}{2}(u + v) - w\right\|^2$$
for all $u, v, w \in V$.


The Greek geometer Apollonius of Perga, who worked in Alexandria
in the third century BC, in his famous book Conics, was the first to
introduce the terms "hyperbola", "parabola", and "ellipse".


Exercise 978 

Let $n$ be a positive integer and let $\|\cdot\|$ be a norm defined on $\mathbb{C}^n$. For each
$A \in M_{n \times n}(\mathbb{C})$, let $\|A\|$ be the spectral norm of $A$. If $A \in M_{n \times n}(\mathbb{C})$ is
nonsingular, show that every singular matrix $B \in M_{n \times n}(\mathbb{C})$ satisfies
$\|A - B\| \ge \|A^{-1}\|^{-1}$. Does there necessarily exist a singular matrix $B$ for which equality
holds?


Exercise 979
Let $V$ be an inner product space over $\mathbb{R}$ and consider the function
$\theta: V \times V \times V \to \mathbb{R}$ defined by
$$\theta(v, w, y) = \|v + w + y\|^2 + \|v + w - y\|^2 - \|v - w - y\|^2 - \|v - w + y\|^2.$$
Show that, for any $v, w, y \in V$, the value of $\theta(v, w, y)$ does not depend on $y$.


Exercise 980 
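Let $V$ be an inner product space over $\mathbb{R}$ and let $n \ge 2$. Let $\theta: V^n \to V$ be the
function defined by $\theta: \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \mapsto \frac{1}{n} \sum_{i=1}^{n} v_i$. Show that
$$\sum_{i=1}^{n} \left\| v_i - \theta\begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \right\|^2 = \sum_{i=1}^{n} \|v_i\|^2 - n \left\| \theta\begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \right\|^2.$$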


Exercise 981 
Let V be an inner product space over R and let v and w be nonzero vectors 
in V. Show that |(v, w)|? = (v, v)(w, w) if and only if the set {v, w} is linearly 
dependent. 


Exercise 982 

Let V be a finitely-generated inner product space and let B = {v),..., v,} bea 
set of vectors in V. Show that B is linearly dependent if and only if its Gram 
matrix is singular. 


Exercise 983 
Let $V$ be a vector space over $\mathbb{R}$ and let $\|\cdot\|$ be a norm defined on $V$. Show that
$\bigl|\|v\| - \|w\|\bigr| \le \|v - w\|$ for all $v, w \in V$.


Exercise 984 

Let $V$ be an inner product space finitely generated over $\mathbb{R}$ and let $\delta \in D(V)$. Pick
$v_0 \in V$. Show that for each real number $e > 0$ there exists a real number $d > 0$
such that $|\delta(v) - \delta(v_0)| < e$ whenever $\|v - v_0\| < d$.


Exercise 985 
Let $V$ be an inner product space. For any $u, v, w \in V$, show that
$$\begin{vmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & d(u,v)^2 & d(u,w)^2 \\ 1 & d(u,v)^2 & 0 & d(v,w)^2 \\ 1 & d(u,w)^2 & d(v,w)^2 & 0 \end{vmatrix} \le 0.$$


Exercise 986
Let $n > 1$. Show that there is no norm $v \mapsto \|v\|$ defined on $\mathbb{C}^n$ satisfying
$$\|A\|_F = \sup\left\{ \frac{\|Av\|}{\|v\|} \;\middle|\; 0_V \ne v \in \mathbb{C}^n \right\} \text{ for all } A \in M_{n \times n}(\mathbb{C}).$$


Exercise 987 
Let $n > 1$ be an integer. For each $A \in M_{n \times n}(\mathbb{C})$, let $\|A\| = \rho(A)$, the spectral
radius of $A$. Does this turn $M_{n \times n}(\mathbb{C})$ into a normed space?


Exercise 988 

Let $V$ be the vector space of all continuous functions from the unit interval $[0, 1]$
on the real line to $\mathbb{R}$ and for each $f \in V$, set $\|f\| = \int_0^1 |f(t)|\,dt$. Is $\|\cdot\|$ a norm
on $V$?


Exercise 989 
Let p > 2 be prime and let n be a positive integer. For each 1 <i <n, 
define w(i) = min{i — 1, p —i + 1}. Does the function GF(p)” — R defined 


ay 

by | : | 07, w(i)a; turn GF(p)” into a normed space? 
an 

Exercise 990 


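Let $0 < p < 1$ and let $f: \mathbb{R}^n \to \mathbb{R}$ be the function defined by
$$f: \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} \mapsto \left(\sum_{i=1}^{n} |a_i|^p\right)^{1/p}.$$
Show that this is not a norm but does satisfy the inequality $f(v + w) \le
2^{(1-p)/p}[f(v) + f(w)]$ for all $v, w \in \mathbb{R}^n$.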


Exercise 991 


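Let $V_1, \ldots, V_n$ be normed spaces over the same field and let $V = \prod_{i=1}^{n} V_i$. For
each $1 \le i \le n$, denote the norm defined on $V_i$ by $\|\cdot\|_i$ and define a function
$v \mapsto \|v\|$ from $V$ to $\mathbb{R}$ by setting $\left\| \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix} \right\| = \sum_{i=1}^{n} \|v_i\|_i$. Is this a norm on $V$?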

Exercise 992 


Let $V$ be a normed space and let $0_V \ne v_0 \in V$. Show that there exists a linear
functional $\theta \in D(V)$ satisfying $\|\theta\| = 1$ and $\theta(v_0) = \|v_0\|$.


Exercise 993 
Let n > 1 and let A € My x»(C). Show that there are infinitely-many other ma- 
trices in M;,x(C) having the same Gershgorin circles as A. 




Exercise 994 

Let A = [ajj] € M2 2(C) and let K be the Cassini oval defined by A. Show that 
every point on the boundary of K is an eigenvalue of a matrix B € M2x2(C) 
defining the same Cassini oval. 


Exercise 995 
Let n > | be an integer and let A = [a;;] € Mnxn(R). For any e > 0, show that 
there exists a nonsingular matrix B € M,n(R) satisfying ||A — Bl|z <e. 


Exercise 996 
Let $n$ be a positive integer and let $A = [a_{ij}] \in M_{n \times n}(\mathbb{C})$. Let $f: \mathbb{R} \to M_{n \times n}(\mathbb{C})$
be defined by $f: t \mapsto e^{tA}$. Show that the derivative of $f$ is given by $f': t \mapsto Ae^{tA}$.


Exercise 997 
Let F be a field and let n be a positive integer. Let a € End(F”) and let V be a 


subspace of F disjoint from ker(a). If || - || denotes the Hamming norm on F”, 
is it necessarily true that ||v|| = ||a(v)|| for all v e V? 
Exercise 998 


Let $n$ be a positive integer and let $\alpha$ be an endomorphism of $\mathbb{C}^n$ represented
with respect to the canonical basis by a matrix $A \in M_{n \times n}(\mathbb{C})$. Then the
canonical inner product on $\mathbb{C}^n$ defines norms on $\mathbb{C}^n$ and on $\mathrm{End}(\mathbb{C}^n)$. Show that
$\rho(A) \le \|\alpha^k\|^{1/k}$ for any integer $k > 0$.


Exercise 999 
Let V and W be normed spaces over R or C and let a: V > W be a linear 
transformation for which ||a|| exists. Show that D(a) satisfies || D(a) || = |la|]. 


Exercise 1000 

Let $V$ be a vector space finitely-generated over a field $F$ and let $L$ be the set of
all subspaces of $V$. For $W, Y \in L$, define $d(W, Y) = \dim(W + Y) - \dim(W \cap Y)$.
Does this function satisfy the conditions of Proposition 15.12?


Exercise 1001 
1 


1 
For each real number ft, set A(t)=] 1 O 1 | €. M3 x3(R). Does there exist a 
1 1 ¢ 
value of t for which || A(t) ||¥ = ||A(@|l2? 


Exercise 1002 


t 1 0 0 
1 ¢ 0 0 : 1 

For each ¢ > 0, let f(t) = OO. 1 ¥ . Calculate limy-,o0 7 f(t). 
0 0 ft 1 
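Exercise 1003


Let $V$ be a vector space over $\mathbb{R}$ and let $Y$ be its complexification. Define a
function $\mu: Y \times Y \to \mathbb{C}$ by
$$\mu: \left(\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}, \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}\right) \mapsto \langle v_1, w_1 \rangle + \langle v_2, w_2 \rangle + i\bigl[\langle v_2, w_1 \rangle - \langle v_1, w_2 \rangle\bigr].$$
Show that $\mu$ is an inner product on $Y$ and calculate $\left\| \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \right\|$
for each $\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \in Y$.


Exercise 1004


Let $V$ be a normed space and let $\alpha \in \mathrm{End}(V)$ have an induced norm satisfying
$\|\alpha\| < 1$. Show that $\sigma_1 - \alpha \in \mathrm{Aut}(V)$.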


Exercise 1005 

Let $V$ be the set of all "infinite matrices" $A = [a_{ij}]$, where $a_{ij} \in \mathbb{R}$ for all $i, j \ge 0$,
which is a vector space over $\mathbb{R}$ with addition and scalar multiplication defined
elementwise. Let $p > 1$ be a real number and let $q$ be a real number satisfying
$\frac{1}{p} + \frac{1}{q} = 1$. Let $W$ be the subset of $V$ consisting of all those matrices $A$ satisfying
the condition that $\sum_{i=0}^{\infty} \bigl[\sum_{j=0}^{\infty} |a_{ij}|^q\bigr]^{p/q}$ is finite. Show that $W$ is a subspace of
$V$ and that the function $A \mapsto \bigl(\sum_{i=0}^{\infty} \bigl[\sum_{j=0}^{\infty} |a_{ij}|^q\bigr]^{p/q}\bigr)^{1/p}$ is a norm on $W$.


16 Orthogonality


Let $V$ be an inner product space and let $0_V \ne v, w \in V$. From Proposition 15.2 we
see that
$$-1 \le \frac{\langle v, w \rangle + \langle w, v \rangle}{2\|v\| \cdot \|w\|} \le 1$$
and so there exists a real number $0 \le t \le \pi$ satisfying
$$\cos(t) = \frac{\langle v, w \rangle + \langle w, v \rangle}{2\|v\| \cdot \|w\|}.$$
This number $t$ is the angle between $v$ and $w$. Note that if we are working over $\mathbb{R}$,
then
$$\cos(t) = \frac{\langle v, w \rangle}{\|v\| \cdot \|w\|}.$$

Example If $V = \mathbb{R}^n$ is endowed with the dot product, and if $0_V \ne v, w \in V$ then,
using analytic geometry, it is easy to show that the angle as defined here is indeed
the angle between the straight line determined by $v$ and the origin, and the straight
line determined by $w$ and the origin. If we define different inner products on $V$, we
build in this manner various non-Euclidean geometries in $n$-space.


Example Let $V = C(0, 1)$, on which we have defined the inner product $\langle f, g \rangle =
\int_0^1 f(x) g(x)\,dx$. In particular, consider the functions $f: x \mapsto 5x^2$ and $g: x \mapsto 3x$.
Then $\|f\| = \sqrt{5}$ and $\|g\| = \sqrt{3}$, and the angle $t$ between $f$ and $g$ satisfies
$$\cos(t) = \frac{1}{\sqrt{15}} \int_0^1 (5x^2)(3x)\,dx = \frac{1}{4}\sqrt{15}.$$


Vectors $v$ and $w$ in an inner product space $V$ are orthogonal if and only if
$\langle v, w \rangle = 0$. In this case, we write $v \perp w$. We note that if $v \perp w$ then $\|v + w\|^2 =
\|v\|^2 + \langle v, w \rangle + \langle w, v \rangle + \|w\|^2 = \|v\|^2 + \|w\|^2$. A nonempty subset $D$ of $V$ is a set of
mutually orthogonal vectors if $v \perp w$ whenever $v \ne w$ in $D$. If $\{v_1, \ldots, v_n\}$ is a
mutually orthogonal set of vectors in $V$ then one shows, similarly, that
$\left\| \sum_{i=1}^{n} v_i \right\|^2 = \sum_{i=1}^{n} \|v_i\|^2$.


Example We have already seen that if $v, w \in \mathbb{R}^3$, then $v \cdot (v \times w) = 0$. This says
that a vector $v$ is orthogonal to $v \times w$, for any vector $w$. The same is also true for $w$
and $v \times w$ and so we see that if $\{v, w\}$ is a linearly-independent subset of $\mathbb{R}^3$ then
the set $\{v, w, v \times w\}$ is linearly independent and so is a basis of $\mathbb{R}^3$.


Moreover, as an immediate consequence of the Lagrange identity on $\mathbb{R}^3$, we see
that if $v \times w = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ and $v \cdot w = 0$ then either $v = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ or $w = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$. If $v, w \in \mathbb{R}^3$
are nonzero, then the angle $t$ between them satisfies the condition that $v \cdot w = (\|v\| \cdot \|w\|) \cos(t)$.
Using the Lagrange identity, we see that
$$\|v \times w\|^2 = \|v\|^2 \|w\|^2 - (v \cdot w)^2 = \|v\|^2 \|w\|^2 [1 - \cos^2(t)] = \|v\|^2 \|w\|^2 \sin^2(t),$$
and so $\|v \times w\| = (\|v\| \cdot \|w\|) |\sin(t)|$. Thus
$$|\sin(t)| = \frac{\|v \times w\|}{\|v\| \cdot \|w\|}.$$


Example If $V = \mathbb{C}^2$ on which we have the dot product, then it is easy to see that
$$\begin{bmatrix} 2 + 3i \\ -1 + 5i \end{bmatrix} \perp \begin{bmatrix} 1 + i \\ -i \end{bmatrix}.$$

Example Let $V = C(-1, 1)$, on which we have defined the inner product $\langle f, g \rangle =
\int_{-1}^{1} f(x) g(x)\,dx$. For all $i \ge 0$, define the functions $p_i \in V$ as follows: $p_0: x \mapsto 1$;
$p_1: x \mapsto x$; and
$$p_{h+1}: x \mapsto \frac{2h + 1}{h + 1}\, x\, p_h(x) - \frac{h}{h + 1}\, p_{h-1}(x) \quad \text{whenever } h \ge 1.$$
These polynomial functions are known as Legendre polynomials. It is easy to verify
that $p_i \perp p_h$ whenever $i \ne h$.
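
The orthogonality claim can be checked exactly for the first few Legendre polynomials. The following Python sketch is illustrative only: it stores a polynomial as a list of `Fraction` coefficients (lowest degree first), and the helper names `poly_mul`, `inner`, and `legendre` are assumptions made for this example.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def inner(p, q):
    """Exact value of the integral of p(x)q(x) over [-1, 1]."""
    prod = poly_mul(p, q)
    # The integral of x^k over [-1, 1] is 2/(k+1) for even k and 0 for odd k.
    return sum((c * Fraction(2, k + 1) for k, c in enumerate(prod) if k % 2 == 0),
               Fraction(0))


def legendre(count):
    """Return p_0, ..., p_{count-1} using the three-term recurrence."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for h in range(1, count - 1):
        x_ph = [Fraction(0)] + polys[h]  # coefficients of x * p_h(x)
        prev = polys[h - 1] + [Fraction(0)] * (len(x_ph) - len(polys[h - 1]))
        polys.append([Fraction(2 * h + 1, h + 1) * a - Fraction(h, h + 1) * b
                      for a, b in zip(x_ph, prev)])
    return polys[:count]


P = legendre(6)
gram = [[inner(p, q) for q in P] for p in P]  # diagonal matrix if orthogonal
```

Here `gram` is the Gram matrix of $p_0, \ldots, p_5$; its off-diagonal entries all vanish, and its diagonal entries are $\langle p_h, p_h \rangle = 2/(2h + 1)$.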
On the same space, we can define another inner product, namely
$$\langle f, g \rangle = \int_{-1}^{1} \frac{f(x) g(x)}{\sqrt{1 - x^2}}\,dx.$$
For each $i \ge 0$, define the function $q_i \in V$ by setting $q_0: x \mapsto 1$; $q_1: x \mapsto x$; and
$q_{h+1}: x \mapsto 2x q_h(x) - q_{h-1}(x)$ whenever $h \ge 1$. These polynomial functions are
known as Chebyshev polynomials. It is again easy to verify that $q_i \perp q_h$ whenever
$i \ne h$.

Both of these products are special instances of a more general construction.
For any $-1 < r, s \in \mathbb{R}$, it is possible to define an inner product on $C(-1, 1)$ by
setting $\langle f, g \rangle = \int_{-1}^{1} f(x) g(x) (1 - x)^r (1 + x)^s\,dx$. The set of polynomial functions
which are mutually orthogonal with respect to this inner product is called the set of
Jacobi polynomials of type $(r, s)$. Such polynomials are important in many areas of
numerical analysis, and in particular in numerical integration.


With kind permission of the Bibliothéque de I’Institut de 
France (Legendre). 

Adrien-Marie Legendre was one of the first-rate 
mathematicians who worked in France during the 
time of the revolution and the generation after it. 
Among other things, he served on the committee 
that defined the metric system. Pafnuty Lvovich
Chebyshev, a nineteenth-century Russian mathematician,
made important contributions to both
pure and applied mathematics.


Proposition 16.1 Let $V$ be an inner product space over a field of scalars $F$.

(1) If $v \in V$ satisfies $v \perp w$ for all $w \in V$, then $v = 0_V$.

(2) If $\emptyset \ne A \subseteq V$ and if $v \in V$ satisfies the condition that $v \perp w$ for all
$w \in A$, then $v \perp w$ for all $w \in FA$.


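Proof (1) is an immediate consequence of the fact that if $v \ne 0_V$ then $\langle v, v \rangle \ne 0$.
(2) Now assume that $\emptyset \ne A \subseteq V$ and that $v \perp w$ for all $w \in A$. If $y \in FA$ then there
exist elements $w_1, \ldots, w_n \in A$ and scalars $a_1, \ldots, a_n$ such that $y = \sum_{i=1}^{n} a_i w_i$ and
so $\langle v, y \rangle = \sum_{i=1}^{n} \overline{a}_i \langle v, w_i \rangle = 0$, whence $v \perp y$.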


Proposition 16.2 Let V be an inner product space and let A be a nonempty 
set of nonzero mutually-orthogonal vectors in V. Then A is linearly indepen- 
dent. 


Proof Let $\{v_1, \ldots, v_n\}$ be a finite subset of $A$ and assume that there exist scalars
$c_1, \ldots, c_n$ such that $\sum_{i=1}^{n} c_i v_i = 0_V$. Then, for $1 \le h \le n$, we have $c_h \langle v_h, v_h \rangle =
\sum_{i=1}^{n} c_i \langle v_i, v_h \rangle = \langle \sum_{i=1}^{n} c_i v_i, v_h \rangle = \langle 0_V, v_h \rangle = 0$ and hence $c_h = 0$. Thus any finite
subset of $A$ is linearly independent, and therefore $A$ is linearly independent.


If $V$ is an inner product space then any vector $0_V \ne w \in V$ defines a function
$$\pi_w: v \mapsto \frac{\langle v, w \rangle}{\langle w, w \rangle}\, w$$
from $V$ to itself, which is in fact a projection the image of which is the subspace
of $V$ generated by $\{w\}$. This easily-checked remark is the basis for the following
theorem.


Proposition 16.3 (Gram-Schmidt Theorem) Any finitely-generated inner
product space $V$ has a basis composed of mutually-orthogonal vectors.


Proof We will proceed by induction on $\dim(V)$. If $\dim(V) = 1$ the result is
immediate. Therefore, we can assume that the proposition is true for any inner
product space of dimension $k$, and assume that $\dim(V) = k + 1$. Let $W$ be a
subspace of $V$ of dimension $k$. By the induction hypothesis, there exists a basis
$\{v_1, \ldots, v_k\}$ of $W$ composed of mutually-orthogonal vectors. Let $v \in V \setminus W$ and
set $v_{k+1} = v - \sum_{i=1}^{k} \pi_{v_i}(v)$. This vector does not belong to $W$ since $v \notin W$.
Therefore, $\{v_1, \ldots, v_{k+1}\}$ is a generating set for $V$. Moreover, for $1 \le j \le k$, we have
$$\langle v_{k+1}, v_j \rangle = \langle v, v_j \rangle - \sum_{i=1}^{k} \frac{\langle v, v_i \rangle}{\langle v_i, v_i \rangle} \langle v_i, v_j \rangle = \langle v, v_j \rangle - \langle v, v_j \rangle = 0$$
and so $v_{k+1} \perp v_j$ for all $1 \le j \le k$. By Proposition 5.3, it follows that the set
$\{v_1, \ldots, v_{k+1}\}$ is linearly independent and so is a basis for $V$.


We should note that the proof of Proposition 16.3 is an algorithm, called the
Gram-Schmidt process, which is easy to implement by a computer program to create
a basis composed of mutually-orthogonal vectors of $V$, when we are given a basis
of any sort for the space.
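
As an illustration of how short such a program can be, here is a minimal Python sketch for $\mathbb{R}^n$ with the dot product. It works in exact rational arithmetic, so roundoff plays no role; the function names `dot`, `project`, and `gram_schmidt` are chosen for this example only.

```python
from fractions import Fraction


def dot(v, w):
    """Dot product of two vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(v, w))


def project(v, w):
    """Image of v under the projection onto the line spanned by w (w nonzero)."""
    scale = Fraction(dot(v, w)) / Fraction(dot(w, w))
    return [scale * b for b in w]


def gram_schmidt(basis):
    """Turn a basis into a basis of mutually-orthogonal vectors."""
    ortho = []
    for v in basis:
        u = [Fraction(a) for a in v]
        for w in ortho:
            # Subtract the projection of v onto each vector built so far.
            p = project(v, w)
            u = [a - b for a, b in zip(u, p)]
        ortho.append(u)
    return ortho


# Three linearly independent vectors in R^4.
U = gram_schmidt([[3, 0, 0, 0], [0, 1, 2, 1], [3, -1, 3, 2]])
```

For these three input vectors the last output vector is $\frac{1}{6}(0, -13, 4, 5)$, and all pairwise dot products of the outputs are zero.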


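Example Let $v_1 = \begin{bmatrix} 3 \\ 0 \\ 0 \\ 0 \end{bmatrix}$, $v_2 = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 1 \end{bmatrix}$, and $v_3 = \begin{bmatrix} 3 \\ -1 \\ 3 \\ 2 \end{bmatrix}$ be vectors in $\mathbb{R}^4$, on
which we have defined the dot product. The set $\{v_1, v_2, v_3\}$ is linearly independent
and so generates a three-dimensional subspace $W$ of $\mathbb{R}^4$. Let us use the
Gram-Schmidt process to build a basis for $W$ composed of mutually-orthogonal vectors.
Indeed, we define $u_1 = v_1$, $u_2 = v_2 - \pi_{u_1}(v_2) = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 1 \end{bmatrix}$, and
$u_3 = v_3 - \pi_{u_1}(v_3) - \pi_{u_2}(v_3) = \frac{1}{6} \begin{bmatrix} 0 \\ -13 \\ 4 \\ 5 \end{bmatrix}$; then $\{u_1, u_2, u_3\}$ is a basis for $W$, the vectors of which are
mutually orthogonal.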


Example Let $V = C(-1, 1)$, on which we have defined the inner product $\langle f, g \rangle =
\int_{-1}^{1} f(x) g(x)\,dx$. For all $i \ge 0$, let $f_i$ be the polynomial function $f_i: x \mapsto x^i$. Then
for each $n > 0$, the set $\{f_0, \ldots, f_n\}$ is linearly independent and so forms a basis for a
subspace $W$ of $V$. We now apply the Gram-Schmidt process to this basis, to obtain
a basis $\{p_0, \ldots, p_n\}$ of vectors in $V$ which are mutually orthogonal, where the $p_i$
are precisely the Legendre polynomials we introduced earlier.


Example If a is a diagonalizable endomorphism of a finitely-generated inner prod- 
uct space V then there exists a basis B composed of eigenvectors of a. Applying 
the Gram-—Schmidt process to B will yield a basis of V composed of mutually- 
orthogonal vectors, but they may no longer be eigenvectors of a. Thus, for exam- 


F _ mp2: : a a+b : _ 1 1 _ 
ple if v= Rita: | | | » ]-anair a ={] j 4 , then the Gram. 


0 


Schmidt process yields ; ly , where H is not an eigenvector of a. 


Actually, the assumption that we have a basis in hand when initiating the
Gram-Schmidt process is one of convenience rather than necessity. We could begin with
an arbitrary generating set $\{v_1, \ldots, v_n\}$ for the given space. In that case, at the $h$th
stage of the process we would begin by checking whether $v_h$ is a linear combination
of the set of mutually-orthogonal vectors $\{u_1, \ldots, u_{h-1}\}$ we have already created. If
it is, we just discard it and go on to $v_{h+1}$.
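
A sketch of this variant, again in Python over the rationals and with a hypothetical function name: in exact arithmetic, $v_h$ is a linear combination of the vectors already created precisely when what remains after subtracting its projections is the zero vector, so the test reduces to checking that remainder.

```python
from fractions import Fraction


def orthogonalize_generating_set(vectors):
    """Build mutually-orthogonal vectors spanning the same space as `vectors`."""
    ortho = []
    for v in vectors:
        u = [Fraction(a) for a in v]
        for w in ortho:
            scale = sum(a * b for a, b in zip(v, w)) / sum(b * b for b in w)
            u = [a - scale * b for a, b in zip(u, w)]
        if any(u):
            # A nonzero remainder means v is independent of the earlier vectors.
            ortho.append(u)
        # Otherwise v is a linear combination of them and is discarded.
    return ortho


# Five generators of R^3; the second and fourth are redundant.
gens = [[1, 1, 0], [2, 2, 0], [1, 0, 1], [0, 1, -1], [0, 0, 1]]
basis = orthogonalize_generating_set(gens)
```

With floating-point input the exact zero test would have to be replaced by a comparison against a tolerance, and choosing that tolerance is a separate numerical question.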

We should point out that the Gram—Schmidt process is not considered computa- 
tionally stable—small errors and roundoffs in the computational process accumulate 
rapidly and can lead at the end to a significant difference between the true solution 
and the computed solution. There are, fortunately, other more sophisticated meth- 
ods of constructing a basis composed of mutually-orthogonal vectors from a given 
basis. 


Proposition 16.4 (Hadamard inequality) Let $n$ be a positive integer, let $A =
[a_{ij}] \in M_{n \times n}(\mathbb{R})$ be a nonsingular matrix, and let $e = |A|$. Then $|e| \le g^n \sqrt{n^n}$,
where $g = \max\{|a_{ij}| \mid 1 \le i, j \le n\}$.


Proof Denote the rows of $A$ by $v_1, \ldots, v_n$. Then $\{v_1, \ldots, v_n\}$ is a basis for $V = \mathbb{R}^n$
and so, using the Gram-Schmidt method, we can find a new basis $\{u_1, \ldots, u_n\}$
for $V$, on which we consider the dot product, composed of mutually-orthogonal
vectors, and defined by setting $u_1 = v_1$ and $u_h = v_h - \sum_{j=1}^{h-1} c_{hj} u_j$, where $c_{hj} =
(v_h \cdot u_j)/(u_j \cdot u_j)$. If $B \in M_{n \times n}(\mathbb{R})$ is the matrix the rows of which are $u_1, \ldots, u_n$,
then $A = CB$, where
$$C = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ c_{21} & 1 & 0 & \cdots & 0 \\ c_{31} & c_{32} & 1 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{n,n-1} & 1 \end{bmatrix}.$$
Since $C$ is a lower-triangular matrix, its determinant is the product of the entries
on its diagonal, namely 1. Therefore $e = |B|$.
By looking at the Gram-Schmidt method, we see that $\|u_i\| \le \|v_i\|$ for all
$1 \le i \le n$. Moreover, since the $u_i$ are mutually orthogonal, we see that $BB^T = D$,
where $D = [d_{ij}]$ is the diagonal matrix defined by $d_{ii} = \|u_i\|^2$ for all $1 \le i \le n$.
Therefore, $e^2 = |BB^T| = |D| = \prod_{i=1}^{n} \|u_i\|^2$. Now let $g = \max\{|a_{ij}| \mid 1 \le i, j \le n\}$.
Then $\|u_i\| \le \|v_i\| \le g\sqrt{n}$ for all $1 \le i \le n$ and so $|e| \le g^n \sqrt{n^n}$, as
desired.
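
The inequality is easy to test on small integer matrices. The Python sketch below is an illustration with hypothetical helper names: it computes the determinant exactly by elimination over the rationals and compares $e^2$ with $g^{2n} n^n$, which avoids taking square roots.

```python
from fractions import Fraction


def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: singular matrix
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            d = -d  # a row swap changes the sign of the determinant
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return d


def hadamard_bound_squared(M):
    """Return g^(2n) * n^n, the square of the bound in the Hadamard inequality."""
    n = len(M)
    g = max(abs(x) for row in M for x in row)
    return Fraction(g) ** (2 * n) * n ** n


A3 = [[1, 1, 1], [1, -1, 1], [1, 1, -1]]
H2 = [[1, 1], [1, -1]]
e3, e2 = det(A3), det(H2)
```

For `A3` the determinant is 4 and the squared bound is 27, while for `H2` the determinant is $-2$ and the squared bound is 4, so the bound is attained in that case.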


Let $V$ be an inner product space having a subspace $W$. Let $W^\perp = \{v \in V \mid
\langle v, w \rangle = 0 \text{ for all } w \in W\}$. By Proposition 16.1, we know that $W^\perp$ is a subspace
of $V$. Since $\langle v, v \rangle \ne 0$ for all $0_V \ne v \in V$, it is clear that $W$ and $W^\perp$ are disjoint.
Also, again by Proposition 16.1, we see that $V^\perp = \{0_V\}$ and $\{0_V\}^\perp = V$. The space
$W^\perp$ is called the orthogonal complement of $W$ in $V$, and this name is justified by
the following result:


Proposition 16.5 Let $W$ be a subspace of a finitely-generated inner product
space $V$. Then $V = W \oplus W^\perp$ and $W = (W^\perp)^\perp$. Moreover, if $Y$ is another
subspace of $V$ then $W^\perp \cap Y^\perp = (W + Y)^\perp$.


Proof By Proposition 16.3, we know that it is possible to find a basis $\{v_1, \ldots, v_k\}$
of $W$ which is composed of mutually-orthogonal vectors, and by the construction
method used in the proof of this proposition, we see that this can be extended to a
basis $\{v_1, \ldots, v_n\}$ of $V$, the elements of which are still mutually orthogonal. Thus
$v_i \in W^\perp$ for all $k < i \le n$, proving that $V = W + W^\perp$. But we already know that $W$
and $W^\perp$ are disjoint and so we have $V = W \oplus W^\perp$. Moreover, $\{v_{k+1}, \ldots, v_n\}$ is a basis
for $W^\perp$ and so $W = (W^\perp)^\perp$.

Now let $Y$ be another subspace of $V$. If $v \in W^\perp \cap Y^\perp$ then, for each $w \in W$
and $y \in Y$ we have $\langle v, w + y \rangle = \langle v, w \rangle + \langle v, y \rangle = 0$ and so $v \in (W + Y)^\perp$.
Conversely, if $v \in (W + Y)^\perp$ then $\langle v, w \rangle = \langle v, y \rangle = 0$ for all $w \in W$ and $y \in Y$,
so $v \in W^\perp \cap Y^\perp$.


In particular, if $V$ is an inner product space having a subspace $W$ then we have a
natural projection of $W \oplus W^\perp$ onto $W$, called the orthogonal projection. The image
of a vector $v \in W \oplus W^\perp$ under this projection is the unique element of $W$ closest to
$v$, according to the distance function defined by the inner product on $V$, in the sense
of the following theorem.


Proposition 16.6 Let $W$ be a subspace of an inner product space $V$ and let
$v = w + y$, where $w \in W$ and $y \in W^\perp$. Then $\|v - w'\| \ge \|v - w\|$ for all
$w' \in W$, with equality holding if and only if $w' = w$.

Proof If $w' \in W$ then
$$\begin{aligned} \|v - w'\|^2 &= \|w - w' + y\|^2 = \langle w - w' + y, w - w' + y \rangle \\ &= \langle w - w', w - w' \rangle + \langle y, w - w' \rangle + \langle w - w', y \rangle + \langle y, y \rangle \\ &= \langle w - w', w - w' \rangle + \langle y, y \rangle \\ &= \|w - w'\|^2 + \|y\|^2 = \|w - w'\|^2 + \|v - w\|^2, \end{aligned}$$
and from here the result follows immediately.
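
The identity in this proof can be observed on a small example. The Python sketch below assumes that $W$ is given by a basis of mutually-orthogonal vectors in $\mathbb{R}^3$ with the dot product; the names `dot` and `orthogonal_projection` are illustrative only.

```python
from fractions import Fraction


def dot(v, w):
    """Dot product of two vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(v, w))


def orthogonal_projection(v, ortho_basis):
    """Project v onto the span of a set of mutually-orthogonal nonzero vectors."""
    w = [Fraction(0)] * len(v)
    for u in ortho_basis:
        scale = Fraction(dot(v, u), dot(u, u))
        w = [a + scale * b for a, b in zip(w, u)]
    return w


u1, u2 = [1, 1, 0], [1, -1, 2]      # orthogonal basis of a plane W in R^3
v = [3, 1, 4]
w = orthogonal_projection(v, [u1, u2])
y = [a - b for a, b in zip(v, w)]   # component of v lying in the complement of W

w_other = u1                        # some other element of W
lhs = dot([a - b for a, b in zip(v, w_other)], [a - b for a, b in zip(v, w_other)])
rhs = dot([a - b for a, b in zip(w, w_other)], [a - b for a, b in zip(w, w_other)]) + dot(y, y)
```

Here `lhs` is $\|v - w'\|^2$ and `rhs` is $\|w - w'\|^2 + \|v - w\|^2$ for the choice $w' = u_1$; the two agree, and `lhs` exceeds $\|v - w\|^2$.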


One of the important problems in computational algebra is the following: Given
an endomorphism $\alpha$ of a finitely-generated inner product space $V$ and a vector
$0_V \ne v_0 \in V$, find an efficient procedure to define an orthogonal projection onto the
Krylov subspace $F[\alpha]v_0$. One of the first of these is the Arnoldi process, a
modification of the Gram-Schmidt process. Several variants of this procedure have been
devised, depending on special properties of $\alpha$. This process is not considered as
computationally efficient as the Lanczos algorithm mentioned earlier. Arnoldi's
process is also the basis for the GMRES algorithm (GMRES = generalized minimal
residual) for solution of systems of linear equations, devised by Yousef Saad and
Martin Schultz in 1986.


© Y. Saad (Saad); © Martin Schultz (Schultz). 


Algerian/American Yousef Saad and American 
Martin H. Schultz are contemporary computer 
scientists. Walter Edward Arnoldi was a twenti- 
eth century American engineer whose career was 
mostly spent with United Aircraft Corporation. 


Note that Proposition 16.5 is not necessarily true if the space V is not finitely 
generated, as the following example shows. 


Example Let $V = \mathbb{R}^{(\mathbb{N})}$. For each $h \ge 0$, let $v_h$ be the sequence in which the $h$th
entry equals 1 and all other entries equal 0. Then $B = \{v_h \mid h \ge 0\}$ is a basis for $V$
composed of mutually-orthogonal vectors. Let $W = \mathbb{R}\{v_0 - v_1, v_1 - v_2, \ldots\}$. This
subspace of $V$ is proper since $v_0 \in V \setminus W$. If $0_V \ne y \in W^\perp$ then there exists a
nonnegative integer $n$ such that $y = \sum_{i=0}^{n} a_i v_i$, where the $a_i$ are real numbers and
$a_n \ne 0$. But then $a_n = \langle y, v_n - v_{n+1} \rangle = 0$, and that is a contradiction. Therefore, we
have shown that $W^\perp = \{0_V\}$, despite the fact that $W \ne V$. Moreover, in this case
$V \ne W \oplus W^\perp$ and $(W^\perp)^\perp = V \ne W$.


Let V be an inner product space. A nonempty subset A of V is orthonormal if 
and only if the elements of A are mutually orthogonal, and each of them is normal. 
Thus, for example, the canonical basis of R”, equipped with the dot product, is 
orthonormal. 
Example Let $V = C(-\pi, \pi)$, on which we have an inner product defined by
$\langle f, g\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)g(x)\,dx$. Then we have an orthonormal subset
$\{\tfrac{1}{\sqrt{2}}\} \cup \{\sin(nx) \mid n \ge 1\} \cup \{\cos(nx) \mid n \ge 1\}$ of $V$.


Example Let $V$ be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of all functions $f$ for which
$\int_{-\infty}^{\infty} |f(x)|^2\,dx$ is finite, where the norm is taken with respect to the inner
product $\langle f, g\rangle = \int_{-\infty}^{\infty} f(x)g(x)\,dx$ defined on $V$. Let $h \in V$ be the function defined by
\[
h : x \mapsto \begin{cases} 1 & \text{for } 0 < x \le \tfrac12, \\ -1 & \text{for } \tfrac12 < x \le 1, \\ 0 & \text{otherwise.} \end{cases}
\]

This function is known as the Haar wavelet. For each $j, k \in \mathbb{N}$ define the function
$h_k^j \in V$ by setting $h_k^j : x \mapsto 2^{j/2} h(2^j x - k)$. Then the subset $\{h_k^j \mid j, k \in \mathbb{N}\}$ of $V$ is
orthonormal. Haar wavelets have important applications in image compression.
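
The orthonormality claimed in this example can be checked numerically for a few indices. The following is a minimal sketch, not part of the original text: the function names `haar`, `h` and `inner` are illustrative, and the integral is replaced by a midpoint sum on a dyadic grid. That sum is exact up to rounding because each product of two wavelets is constant on sufficiently small dyadic cells, and the midpoints never touch the breakpoints, so the convention chosen at the endpoints plays no role.

```python
def haar(x):
    """Haar mother wavelet: +1 on (0, 1/2), -1 on (1/2, 1), 0 elsewhere."""
    if 0.0 < x < 0.5:
        return 1.0
    if 0.5 < x < 1.0:
        return -1.0
    return 0.0


def h(j, k, x):
    """Scaled and shifted wavelet x -> 2^(j/2) * haar(2^j x - k)."""
    return 2.0 ** (j / 2.0) * haar(2.0 ** j * x - k)


def inner(j1, k1, j2, k2, depth=8, upper=4):
    """Integral of h(j1,k1,.) * h(j2,k2,.) over [0, upper] by a midpoint sum.

    Assumes depth > max(j1, j2) + 1 and that both supports lie in [0, upper],
    so the integrand is constant on every cell of width 2^-depth.
    """
    width = 2.0 ** -depth
    cells = upper * 2 ** depth
    total = 0.0
    for m in range(cells):
        x = (m + 0.5) * width
        total += h(j1, k1, x) * h(j2, k2, x)
    return total * width


if __name__ == "__main__":
    print(inner(1, 0, 1, 0), inner(0, 0, 1, 0), inner(2, 1, 2, 2))
```

The grid depth and the integration range are assumptions that suit the small indices used here; larger $j$ or $k$ would need a finer grid or a wider range.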


The twentieth-century Hungarian mathematician Alfréd Haar worked
primarily in analysis.


Proposition 16.7 Every finitely-generated inner product space $V$ has an
orthonormal basis.


Proof By Proposition 16.3, we know that $V$ has a basis $\{v_1, \ldots, v_n\}$ the elements
of which are mutually orthogonal. For each $1 \le i \le n$, let $w_i = \|v_i\|^{-1} v_i$. Then
each $w_i$ is normal and $\{w_1, \ldots, w_n\}$ is a basis for $V$, the elements of which remain
mutually orthogonal.


We can modify the Gram—Schmidt method to provide an algorithm for construct- 
ing an orthonormal basis from any given basis of a finitely-generated inner product 
space V, by normalizing each basis element as it is created. This has the added 
advantage of tending to reduce accumulated roundoff and truncation errors. The ex- 
amples after Proposition 16.3 and Proposition 16.6 show that inner product spaces 
which are not finitely generated may have orthonormal bases as well, but this is 
not always true. Making use of the Hausdorff Maximum Principle, it is possible to 
show that every inner product space V has a maximal orthonormal set, which must 
be linearly independent by Proposition 16.2. Such a subset is called a Hilbert subset 
of $V$. Clearly, an orthonormal subset $A$ of $V$ is a Hilbert subset if and only if for every
$0_V \ne y \in V$ there exists a $v \in A$ satisfying $\langle v, y\rangle \ne 0$. If $V$ is finitely generated then any Hilbert
subset of V is a basis for V, but this is not necessarily true for inner product spaces 
which are not finitely generated. 


Example Let $V$ be the set of all infinite sequences $c_0, c_1, \ldots$ of complex numbers
satisfying the condition that $\sum_{i=0}^{\infty} |c_i|^2 < \infty$. We have already seen that this is an
inner product space. For each $k \ge 0$, let $v_k$ be the sequence $c_0, c_1, \ldots$ in which
\[
c_i = \begin{cases} 1 & \text{if } i = k, \\ 0 & \text{otherwise.} \end{cases}
\]

Then $\{v_k \mid k \ge 0\}$ is a Hilbert subset of $V$ which is not a basis for $V$.


Example It is, of course, possible that a finitely-generated inner product space may 
have many different orthonormal bases. For example, the canonical basis of R* is 
orthonormal, as is the basis 


i) 


We can use Proposition 16.7 to verify the assertion made in the previous chapter. 


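Proposition 16.8 If $V$ and $W$ are inner product spaces over the same field
$F$, and if $V$ is finitely generated, then $\|\alpha\|$ is finite for all $\alpha \in \operatorname{Hom}(V, W)$.


Proof Pick an orthonormal basis $\{v_1, \ldots, v_n\}$ for $V$ and let $\alpha \in \operatorname{Hom}(V, W)$. If
$v = \sum_{i=1}^{n} a_i v_i$ is normal, then
$1 = \|v\|^2 = \langle v, v\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i \bar{a}_j \langle v_i, v_j\rangle = \sum_{i=1}^{n} |a_i|^2$
and so $|a_i| \le 1$ for each $1 \le i \le n$. Therefore,
$\|\alpha(v)\| \le \sum_{i=1}^{n} |a_i| \cdot \|\alpha(v_i)\| \le \sum_{i=1}^{n} \|\alpha(v_i)\|$.
Thus $\|\alpha\|$ is finite and at most $c = \sum_{i=1}^{n} \|\alpha(v_i)\|$.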


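Example Of course, $\|\alpha\|$ may be finite even when $V$ is not finitely generated. For
example, let $V = C(0, 1)$ and define a norm on $V$ by setting $\|f\| = \max\{|f(t)| \mid 0 \le t \le 1\}$.
Let $g : [0, 1] \times [0, 1] \to \mathbb{R}$ be a continuous function. Let $\alpha$ be the
endomorphism of $V$ defined by $\alpha(f) : t \mapsto \int_0^1 g(t, s) f(s)\,ds$. Since $g$ is continuous
on a closed and bounded subset of $\mathbb{R}^2$, we note that it is bounded there, say $|g(t, s)| \le c$ for all
$0 \le t, s \le 1$. Moreover, $|f(s)| \le \max\{|f(t)| \mid 0 \le t \le 1\} = \|f\|$ for all $0 \le s \le 1$,
and so
\[
\begin{aligned}
\|\alpha(f)\| &= \max\left\{ \left| \int_0^1 g(t, s) f(s)\,ds \right| \;\middle|\; 0 \le t \le 1 \right\} \\
&\le \max\left\{ \int_0^1 |g(t, s)| \cdot |f(s)|\,ds \;\middle|\; 0 \le t \le 1 \right\} \le c\|f\|
\end{aligned}
\]
for all $f \in V$.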
Proposition 16.9 Let $V$ be an inner product space having an orthonormal
basis $B$. If $v = \sum_{y \in B} a_y y$ and $w = \sum_{y \in B} b_y y$ are vectors in $V$ (where only
finitely-many of the $a_y$ and $b_y$ are nonzero), then $\langle v, w\rangle = \sum_{y \in B} a_y \bar{b}_y$.


Proof By the properties of the inner product, we have
\[
\langle v, w\rangle = \Bigl\langle \sum_{y \in B} a_y y, \sum_{x \in B} b_x x \Bigr\rangle
= \sum_{y \in B}\sum_{x \in B} a_y \bar{b}_x \langle y, x\rangle = \sum_{y \in B} a_y \bar{b}_y.
\]


Proposition 16.10 Let $V$ be an inner product space having an orthonormal
basis $B$. Then each $v \in V$ satisfies $v = \sum_{x \in B} \langle v, x\rangle x$.


Proof We know that there exist scalars $\{a_x \mid x \in B\}$, only finitely-many of which
are nonzero, such that $v = \sum_{x \in B} a_x x$. Then for each $x \in B$ we have
$\langle v, x\rangle = \bigl\langle \sum_{y \in B} a_y y, x \bigr\rangle = \sum_{y \in B} a_y \langle y, x\rangle = a_x \langle x, x\rangle = a_x$,
which yields the desired result.


The coefficients $\langle v, x\rangle$ encountered in Proposition 16.10 are called the Fourier
coefficients of the vector $v$ with respect to the given orthonormal basis.


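Example Consider the vector space $C(-1, 1)$ over $\mathbb{R}$, on which we have the inner
product $\langle f, g\rangle = \int_{-1}^{1} f(x)g(x)\,dx$. We want to find a polynomial function of
degree at most 3 which most closely approximates the function $f : x \mapsto \sin(x)$ on the
interval $[-1, 1]$. To do so, consider the subspace $V$ of $C(-1, 1)$ generated by the
functions $p_i : x \mapsto x^i$ for $0 \le i \le 3$ and $f$. Apply the Gram–Schmidt process to
the basis $\{p_0, \ldots, p_3, f\}$ of $V$ to get an orthonormal basis $\{q_0, \ldots, q_3, g\}$, where
\[
q_0 : x \mapsto \sqrt{\tfrac12}; \quad
q_1 : x \mapsto \sqrt{\tfrac32}\,x; \quad
q_2 : x \mapsto \sqrt{\tfrac52}\bigl(\tfrac32 x^2 - \tfrac12\bigr); \quad\text{and}\quad
q_3 : x \mapsto \sqrt{\tfrac72}\bigl(\tfrac52 x^3 - \tfrac32 x\bigr).
\]
By Proposition 16.6 and Proposition 16.10, we know that the polynomial function of
degree at most 3 which most closely approximates the function $f$ is $\sum_{i=0}^{3} \langle f, q_i\rangle q_i$,
where the Fourier coefficients $\langle f, q_i\rangle$ are given by
\[
\begin{aligned}
\langle f, q_0\rangle &= \int_{-1}^{1} \sqrt{\tfrac12}\,\sin(x)\,dx = 0; \\
\langle f, q_1\rangle &= \int_{-1}^{1} \sqrt{\tfrac32}\,\sin(x)\,x\,dx = \sqrt{6}\,(\sin(1) - \cos(1)) \approx 0.738; \\
\langle f, q_2\rangle &= \int_{-1}^{1} \sqrt{\tfrac52}\bigl(\tfrac32 x^2 - \tfrac12\bigr)\sin(x)\,dx = 0; \\
\langle f, q_3\rangle &= \int_{-1}^{1} \sqrt{\tfrac72}\bigl(\tfrac52 x^3 - \tfrac32 x\bigr)\sin(x)\,dx
= 14\sqrt{14}\cos(1) - 9\sqrt{14}\sin(1) \approx -0.034.
\end{aligned}
\]
Thus the polynomial function we seek is given by $u : x \mapsto -0.158x^3 + 0.998x$.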
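
The computation in this example can be reproduced numerically. The sketch below is an illustration rather than part of the original text: it assumes the orthonormal polynomials $q_0, \ldots, q_3$ are the normalized Legendre polynomials on $[-1, 1]$, and it replaces the exact integrals by a composite Simpson rule. The helper names `simpson`, `evaluate`, `fourier_coefficient` and `best_cubic` are hypothetical.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3.0


# Orthonormal polynomials on [-1, 1] as coefficient lists [c0, c1, c2, c3],
# meaning c0 + c1*x + c2*x^2 + c3*x^3 (assumed: normalized Legendre).
Q = [
    [math.sqrt(1 / 2), 0.0, 0.0, 0.0],
    [0.0, math.sqrt(3 / 2), 0.0, 0.0],
    [-0.5 * math.sqrt(5 / 2), 0.0, 1.5 * math.sqrt(5 / 2), 0.0],
    [0.0, -1.5 * math.sqrt(7 / 2), 0.0, 2.5 * math.sqrt(7 / 2)],
]


def evaluate(coeffs, x):
    """Value at x of the polynomial with the given coefficient list."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def fourier_coefficient(f, q):
    """Inner product <f, q> = integral of f*q over [-1, 1]."""
    return simpson(lambda x: f(x) * evaluate(q, x), -1.0, 1.0)


def best_cubic(f):
    """Fourier coefficients of f and the monomial coefficients of sum <f,q_i> q_i."""
    coeffs = [fourier_coefficient(f, q) for q in Q]
    poly = [sum(coeffs[i] * Q[i][k] for i in range(4)) for k in range(4)]
    return coeffs, poly


coeffs, poly = best_cubic(math.sin)

if __name__ == "__main__":
    print("Fourier coefficients:", coeffs)
    print("coefficients of 1, x, x^2, x^3:", poly)
```

Because the sine function is odd, the even-degree coefficients vanish up to rounding, which is a quick sanity check on the quadrature.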


Proposition 16.11 Let $F$ be $\mathbb{R}$ or $\mathbb{C}$ and let $k$ and $n$ be positive integers.
Let $A \in M_{k \times n}(F)$ be a matrix the columns of which are linearly independent
in $F^k$. Then there exist matrices $Q \in M_{k \times n}(F)$ and $R \in M_{n \times n}(F)$ such that
(1) $A = QR$;
(2) The columns of $Q$ are orthonormal with respect to the dot product on $F^k$;
(3) $R$ is nonsingular and upper-triangular.


Proof Let $u_1, \ldots, u_n$ be the columns of $A$. Apply the Gram–Schmidt process to
the set $\{u_1, \ldots, u_n\}$ and then normalize each of the resulting vectors to obtain
an orthonormal set $\{v_1, \ldots, v_n\}$ of vectors in $F^k$. Let $Q \in M_{k \times n}(F)$ be the
matrix having columns $v_1, \ldots, v_n$. Then, by Proposition 16.10, we see that
$u_i = \sum_{j=1}^{n} (u_i \cdot v_j) v_j$ for all $1 \le i \le n$, and so $A = QR$, where $R = [r_{ij}] \in M_{n \times n}(F)$
is given by $r_{ji} = u_i \cdot v_j$ for all $1 \le i, j \le n$. This matrix is clearly nonsingular.
Moreover, we note that the Gram–Schmidt process is such that $v_j$ is orthogonal
to $u_1, \ldots, u_{j-1}$ for all $2 \le j \le n$ and so $r_{ij} = 0$ when $i > j$. Therefore, $R$ is also
upper-triangular.


A factorization of a matrix in the form given by Proposition 16.11 is called a
QR-decomposition. Such decompositions form the basis of many important numerical
algorithms, and are widely used, for example, in computing eigenvalues of large
matrices. The use is primarily iterative. If $A$ is an $n \times n$ matrix over $\mathbb{R}$ or $\mathbb{C}$ the
eigenvalues of which have distinct absolute values and if we can indefinitely perform
the iteration

(1) $A_1 = A$;
(2) If $A_i$ has a QR-decomposition $A_i = Q_i R_i$ then set $A_{i+1} = R_i Q_i$;

then, under rather mild conditions on $A$, the sequence $A_1, A_2, \ldots$ of matrices tends to an
upper triangular matrix in which the eigenvalues of $A$ appear in decreasing order
of absolute value along the diagonal.
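
The iteration just described can be tried on a small matrix. The following sketch is only illustrative: it assumes a real square matrix with linearly independent columns, uses a plain Gram–Schmidt factorization (not a production-quality one), and runs a fixed number of steps instead of testing for convergence. The names `qr`, `matmul` and `qr_iteration` are hypothetical.

```python
import math


def qr(a):
    """QR-decomposition of a real square matrix, given as a list of rows."""
    n = len(a)
    columns = [[a[row][col] for row in range(n)] for col in range(n)]
    q_columns = []
    r = [[0.0] * n for _ in range(n)]
    for i, u in enumerate(columns):
        y = list(u)
        for j, v in enumerate(q_columns):
            r[j][i] = sum(yk * vk for yk, vk in zip(y, v))
            y = [yk - r[j][i] * vk for yk, vk in zip(y, v)]
        r[i][i] = math.sqrt(sum(yk * yk for yk in y))
        q_columns.append([yk / r[i][i] for yk in y])
    q = [[q_columns[col][row] for col in range(n)] for row in range(n)]
    return q, r


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def qr_iteration(a, steps=200):
    """Repeat A_{i+1} = R_i Q_i, where A_i = Q_i R_i, a fixed number of times."""
    for _ in range(steps):
        q, r = qr(a)
        a = matmul(r, q)
    return a


A = [[2.0, 1.0], [1.0, 3.0]]
limit = qr_iteration(A)

if __name__ == "__main__":
    print(limit)  # diagonal entries approximate the eigenvalues (5 ± sqrt(5)) / 2
```

The sample matrix has eigenvalues $(5 \pm \sqrt{5})/2$, which have distinct absolute values, so the hypothesis of the text is satisfied for it.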


QR-decompositions were de- 
veloped independently by the 
Swiss computer scientist Heinz 
Rutishauser, one of the fa- 
thers of ALGOL, by the Rus- 
sian computer scientist Vera 
Kublanovskaya, and by John G.F. Francis of the British computer manufacturer Fer- 
ranti Ltd. 


One of the major advantages of QR-decompositions is that they are easy to update.
If we are given a decomposition $A = QR$, and then the matrix $A$ is altered
slightly to obtain a matrix $A'$ by changing a few of its entries, it is relatively easy
to alter $Q$ and $R$ to get a QR-decomposition for $A'$. This is important since many
applications of linear algebra involve solving successive systems of linear equations
of the form $A^{(i)} X = w^{(i)}$, where $A^{(i+1)}$ and $w^{(i+1)}$ are obtained from $A^{(i)}$ and $w^{(i)}$
by relatively minor modifications, based on data from some external source which
is periodically updated.

The following QR-algorithm is used to compute a QR-decomposition of a matrix
$A \in M_{k \times n}(F)$ with columns $u_1, \ldots, u_n$:


For $i = 1$ to $n$ do steps (1)-(3):

(1) $y_i = u_i$;
(2) For $j = 1$ to $i - 1$ set $r_{ji} = u_i \cdot v_j$ and $y_i = y_i - r_{ji} v_j$;
(3) Set $r_{ii} = \|y_i\|$ and $v_i = r_{ii}^{-1} y_i$.

Then $Q$ is the matrix with columns $v_1, \ldots, v_n$ and $R = [r_{ij}]$, where the entries
not assigned above are equal to 0. Note that step
(3) presupposes that we have already checked that the set of columns of $A$ is
linearly independent. If not, then we have to add an initial check to ensure that $r_{ii}$
is nonzero, before we attempt to invert it. As already noted, the Gram–Schmidt
method is not numerically stable and hence neither is this algorithm for finding a
QR-decomposition. It can be modified to produce a somewhat more stable
algorithm by replacing the definition of $r_{ji}$ in step (2) by $r_{ji} = y_i \cdot v_j$.
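
Steps (1)-(3) translate almost line by line into code. The sketch below is a minimal illustration for real matrices only, with the columns of $A$ passed as a list of lists. The function name `qr_gram_schmidt` and the flag `modified` are hypothetical: the flag switches between the definition of the coefficient in step (2) that uses the original column and the more stable variant that uses the partially reduced vector.

```python
import math


def dot(x, y):
    """Dot product of two real vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))


def qr_gram_schmidt(columns, modified=True):
    """Return (q_columns, r) for a list of linearly independent real columns.

    modified=False: step (2) uses the original column u_i (classical form).
    modified=True:  step (2) uses the current vector y_i (more stable form).
    """
    n = len(columns)
    r = [[0.0] * n for _ in range(n)]
    q_columns = []
    for i, u in enumerate(columns):
        y = list(u)                                    # step (1)
        for j, v in enumerate(q_columns):              # step (2)
            r[j][i] = dot(y if modified else u, v)
            y = [yk - r[j][i] * vk for yk, vk in zip(y, v)]
        r[i][i] = math.sqrt(dot(y, y))                 # step (3)
        if r[i][i] == 0.0:
            # Exact-zero test is a placeholder; a tolerance is safer in practice.
            raise ValueError("columns are linearly dependent")
        q_columns.append([yk / r[i][i] for yk in y])
    return q_columns, r


if __name__ == "__main__":
    cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
    q, r = qr_gram_schmidt(cols)
    print(q)
    print(r)
```

In exact arithmetic the two variants give the same $Q$ and $R$; they differ only in how rounding errors accumulate.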

A variant on this algorithm, called the QZ algorithm, has been devised by Moler 
and Stewart to find solutions for generalized eigenvalue problems. 



The contemporary American computer scientist 
Cleve Moler, after a distinguished academic ca- 
reer, became chairman and chief scientist of 
MathWorks, the company that developed MAT- 
LAB. G.W. Stewart is a contemporary American 
computer scientist. 


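Proposition 16.12 Let $V$ be an inner product space having an orthonormal
basis $B$. Then for all $v, w \in V$:

(1) (Parseval's identity) $\langle v, w\rangle = \sum_{y \in B} \overline{\langle y, v\rangle}\,\langle y, w\rangle$;
(2) (Bessel's identity) $\|v\|^2 = \sum_{y \in B} |\langle y, v\rangle|^2$.


Proof Parseval's identity follows from the calculation
\[
\langle v, w\rangle = \Bigl\langle \sum_{y \in B} \langle v, y\rangle y, w \Bigr\rangle
= \sum_{y \in B} \langle v, y\rangle \langle y, w\rangle
= \sum_{y \in B} \overline{\langle y, v\rangle}\,\langle y, w\rangle;
\]
Bessel's identity derives from this in the special case $v = w$.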
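
Both identities are easy to test on a small example. The sketch below is illustrative only: it works in $\mathbb{C}^2$ with the dot product taken linear in the first argument, and the orthonormal basis `BASIS` and the function names are choices made here, not taken from the text.

```python
def inner(v, w):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(v, w))


S = 2 ** -0.5
# An orthonormal basis of C^2 other than the canonical one (illustrative choice).
BASIS = [(S, S * 1j), (S, -S * 1j)]


def parseval_sum(v, w, basis=BASIS):
    """Right-hand side of Parseval's identity for the given orthonormal basis."""
    return sum(inner(y, v).conjugate() * inner(y, w) for y in basis)


def bessel_sum(v, basis=BASIS):
    """Right-hand side of Bessel's identity for the given orthonormal basis."""
    return sum(abs(inner(y, v)) ** 2 for y in basis)


if __name__ == "__main__":
    v = (1 + 2j, 3 - 1j)
    w = (2 - 1j, 1j)
    print(inner(v, w), parseval_sum(v, w))
    print(inner(v, v).real, bessel_sum(v))
```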


Wilhelm Bessel was a nineteenth century astronomer and a friend of 
Gauss; his mathematical work came as a result of his research on plan- 
etary orbits. The French mathematician Marc-Antoine Parseval pub- 
lished only five short papers at the end of the eighteenth century. 


The following result shows that orthogonality can be used to determine the
relation between two different inner products defined on a vector space over $\mathbb{R}$.


Proposition 16.13 Let $V$ be a vector space over $\mathbb{R}$ on which we have
defined two inner products, $\mu_1$ and $\mu_2$. For $i = 1, 2$, let
$Y_i = \{(v, w) \in V \times V \mid \mu_i(v, w) = 0\}$. Then the following conditions are equivalent:

(1) There exists a positive real number $c$ such that $\mu_2 = c^2\mu_1$;
(2) $Y_1 = Y_2$;
(3) $Y_1 \subseteq Y_2$.


Proof Since it is clear that (1) implies (2), and (2) implies (3), all we have to
prove is that (3) implies (1). Therefore, assume (3). First, let us consider the
case $\dim(V) = 1$, i.e., the case in which $V = \mathbb{R}$. Then, for $i = 1, 2$, the scalar
$b_i = \mu_i(1, 1)$ is nonzero. Set $d = \mu_2(1, 1)/\mu_1(1, 1)$. If $a, b \in \mathbb{R}$, then
$\mu_2(a, b) = ab\mu_2(1, 1) = abd\mu_1(1, 1) = d\mu_1(a, b)$ and so, taking $c = \sqrt{d}$, we have
established (1). Thus we can assume that $\dim(V) \ge 2$. For each $i = 1, 2$, and each $v \in V$,
let $\|v\|_i = \sqrt{\mu_i(v, v)} \ge 0$. Without loss of generality, we can assume that there
exist elements $v, w \in V$ and positive real numbers $a < b$ such that $\|v\|_2 = a\|v\|_1$
and $\|w\|_2 = b\|w\|_1$, since otherwise we would immediately have (1). Suppose that
$w = dv$ for some $0 \ne d \in \mathbb{R}$. Then $\|w\|_2 = |d| \cdot \|v\|_2 = |d|a \cdot \|v\|_1 = a\|w\|_1$, and so
$a = b$, which is contrary to our assumption that $a < b$. Therefore, we conclude that
the set $\{v, w\}$ is linearly independent. Normalizing $v$ and $w$ with respect to $\mu_1$, if
necessary, we can furthermore assume that $\|v\|_1 = 1 = \|w\|_1$.

We claim that $(v, w) \notin Y_1$. Indeed, assume otherwise. Then we have
$\mu_1(v + w, v - w) = \mu_1(v, v) - \mu_1(w, w) = \|v\|_1^2 - \|w\|_1^2 = 0$, which implies
$(v + w, v - w) \in Y_1$. Therefore, by (3), $(v + w, v - w) \in Y_2$. But then we have
$\mu_2(v + w, v - w) = \|v\|_2^2 - \|w\|_2^2 = a^2 - b^2 \in \mathbb{R} \setminus \{0\}$, yielding a
contradiction and establishing the claim. Set $y = v - \|v\|_1^2\mu_1(v, w)^{-1}w$. Then
$\mu_1(y, v) = \|v\|_1^2 - r\mu_1(v, w) = 0$, where $r = \|v\|_1^2\mu_1(v, w)^{-1}$, and so $(y, v) \in Y_1 \subseteq Y_2$.
Since the set $\{v, w\}$ is linearly independent, we know that $y \ne 0_V$. If we set
$y' = \|y\|_1^{-1}y$, then $(y', v) \in Y_1 \subseteq Y_2$ and so, as before,
$\mu_1(y' + v, y' - v) = \|y'\|_1^2 - \|v\|_1^2 = 0$. Hence $(y' + v, y' - v) \in Y_1$. Thus
$\|y\|_1^2 + \|v\|_1^2 = \|-y + v\|_1^2 = \|rw\|_1^2 = \|v\|_1^4\,|\mu_1(v, w)|^{-2}\,\|w\|_1^2$ and so
$\|y\|_1^2 = \|v\|_1^4\,|\mu_1(v, w)|^{-2}\,\|w\|_1^2 - \|v\|_1^2$. Since $\mu_2(y, v) = 0$, we see that
$\|y\|_2^2 + \|v\|_2^2 = \|-y + v\|_2^2 = \|rw\|_2^2 = \|v\|_1^4\,|\mu_1(v, w)|^{-2}\,\|w\|_2^2$. Thus
\[
\begin{aligned}
\|y\|_2^2 &= \|v\|_1^4\,|\mu_1(v, w)|^{-2}\,\|w\|_2^2 - \|v\|_2^2 \\
&= \|v\|_1^4\,|\mu_1(v, w)|^{-2}\,b^2\|w\|_1^2 - a^2\|v\|_1^2 \\
&> a^2\bigl(\|v\|_1^4\,|\mu_1(v, w)|^{-2}\,\|w\|_1^2 - \|v\|_1^2\bigr) = a^2\|y\|_1^2.
\end{aligned}
\]
Since $\mu_2(y', v) = 0$, this implies that $\mu_2(y' + v, y' - v) = \|y'\|_2^2 - \|v\|_2^2 > a^2 - a^2 = 0$,
contradicting (3) and the fact that $\mu_1(y' + v, y' - v) = 0$. From this
contradiction, we conclude that there can be no elements $a$ and $b$ as above, so there
must exist a positive real number $c$ such that $\|v\|_2 = c\|v\|_1$ for each nonzero vector
$v \in V$. Then for each $v, w \in V$ we have
\[
\mu_2(v, w) = \frac14\bigl[\|v + w\|_2^2 - \|v - w\|_2^2\bigr]
= \frac{c^2}{4}\bigl[\|v + w\|_1^2 - \|v - w\|_1^2\bigr] = c^2\mu_1(v, w),
\]
which proves (1).


Let $V$ be an inner product space. We have already seen that, for each $w \in V$,
the function from $V$ to the field of scalars given by $v \mapsto \langle v, w\rangle$ belongs to $D(V)$. If
$V$ is finitely generated, we claim that every element of $D(V)$ is of this form. The
following result is actually a special case of a much wider, and more complicated,
theorem.


Proposition 16.14 (Riesz Representation Theorem) Let $V$ be a finitely-generated
inner product space. If $\delta \in D(V)$ then there exists a unique vector
$y \in V$ satisfying $\delta(v) = \langle v, y\rangle$ for all $v \in V$.


Proof Let $\{v_1, \ldots, v_n\}$ be an orthonormal basis for $V$ and let $y = \sum_{i=1}^{n} \overline{\delta(v_i)}\,v_i$.
Then for all $1 \le h \le n$ we have
\[
\langle v_h, y\rangle = \Bigl\langle v_h, \sum_{i=1}^{n} \overline{\delta(v_i)}\,v_i \Bigr\rangle
= \sum_{i=1}^{n} \delta(v_i)\langle v_h, v_i\rangle = \delta(v_h),
\]
and so $\langle v, y\rangle = \delta(v)$ for all $v \in V$. The vector $y$ is unique since if $\langle v, x\rangle = \langle v, y\rangle$ for
all $v \in V$ then $x = \sum_{i=1}^{n} \langle x, v_i\rangle v_i = \sum_{i=1}^{n} \overline{\delta(v_i)}\,v_i = y$, as desired.
The twentieth century Hungarian mathematician Frigyes Riesz was
one of the founders of functional analysis.


Example Let $n > 1$ be an integer and let $V$ be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of
all polynomial functions of degree at most $n$, on which we have an inner product
defined by $\langle f, g\rangle = \int_{-1}^{1} f(t)g(t)\,dt$. Let $\delta \in D(V)$ be the linear functional defined
by $\delta : f \mapsto f(0)$. By Proposition 16.14, there exists a polynomial function $p \in V$
satisfying the condition $f(0) = \int_{-1}^{1} f(t)p(t)\,dt$ for all $f \in V$. The function $p$ is
defined to be $\sum_{i=0}^{n} p_i(0)p_i$, where $p_i$ is the $i$th Legendre polynomial, normalized
so that $\|p_i\| = 1$.
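
The representing polynomial of this example can be computed exactly with rational arithmetic. The sketch below is an illustration under stated assumptions: it takes the orthonormal polynomials to be $q_i = \sqrt{(2i+1)/2}\,P_i$, where $P_i$ is the classical Legendre polynomial generated by the three-term recurrence $(k+1)P_{k+1} = (2k+1)tP_k - kP_{k-1}$, so that $q_i(0)q_i = \tfrac{2i+1}{2}P_i(0)P_i$ has rational coefficients. The function names are hypothetical.

```python
from fractions import Fraction


def legendre(n):
    """Coefficient lists (lowest degree first) of the Legendre polynomials P_0..P_n."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for k in range(1, n):
        shifted = [Fraction(0)] + polys[k]             # coefficients of t * P_k
        previous = polys[k - 1] + [Fraction(0)] * 2    # P_{k-1}, padded to equal length
        polys.append([((2 * k + 1) * s - k * p) / (k + 1)
                      for s, p in zip(shifted, previous)])
    return polys[: n + 1]


def representer(n):
    """Coefficients of p with f(0) = integral of f*p over [-1, 1] for deg f <= n."""
    p = [Fraction(0)] * (n + 1)
    for i, poly in enumerate(legendre(n)):
        weight = Fraction(2 * i + 1, 2) * poly[0]      # poly[0] equals P_i(0)
        for k, c in enumerate(poly):
            p[k] += weight * c
    return p


def integrate_product(f, g):
    """Exact integral over [-1, 1] of the product of two coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            if (i + j) % 2 == 0:                       # odd powers integrate to 0
                total += a * b * Fraction(2, i + j + 1)
    return total


if __name__ == "__main__":
    print(representer(2))  # coefficients of 1, t, t^2
```

For $n = 2$ this gives $p(t) = \tfrac98 - \tfrac{15}{8}t^2$, and one checks directly that integrating $1$, $t$ and $t^2$ against $p$ over $[-1, 1]$ yields $1$, $0$ and $0$.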


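Proposition 16.15 Let $V$ and $W$ be finitely-generated inner product spaces,
and let $\alpha : V \to W$ be a linear transformation. Then there exists a unique
linear transformation $\alpha^* : W \to V$ satisfying the condition
$\langle \alpha(v), w\rangle = \langle v, \alpha^*(w)\rangle$ for all $v \in V$ and all $w \in W$.


Proof Let $w$ be a given vector in $W$. It is easy to check that the function $\delta$ from
$V$ to $F$ defined by $\delta : v \mapsto \langle \alpha(v), w\rangle$ is a linear functional. By Proposition 16.14,
we know that there exists a unique vector $y_w \in V$ satisfying $\delta(v) = \langle v, y_w\rangle$ for all
$v \in V$. Define the function $\alpha^* : W \to V$ by $\alpha^* : w \mapsto y_w$. We have to prove that this
function is indeed a linear transformation. Indeed, if $w_1, w_2 \in W$ then
\[
\begin{aligned}
\langle v, \alpha^*(w_1 + w_2)\rangle &= \langle \alpha(v), w_1 + w_2\rangle = \langle \alpha(v), w_1\rangle + \langle \alpha(v), w_2\rangle \\
&= \langle v, \alpha^*(w_1)\rangle + \langle v, \alpha^*(w_2)\rangle = \langle v, \alpha^*(w_1) + \alpha^*(w_2)\rangle,
\end{aligned}
\]
and this is true for all $v \in V$, so we have $\alpha^*(w_1 + w_2) = \alpha^*(w_1) + \alpha^*(w_2)$ for all
$w_1, w_2 \in W$. If $c$ is a scalar and if $w \in W$ then
\[
\langle v, \alpha^*(cw)\rangle = \langle \alpha(v), cw\rangle = \bar{c}\langle \alpha(v), w\rangle = \bar{c}\langle v, \alpha^*(w)\rangle = \langle v, c\alpha^*(w)\rangle
\]
for all $v \in V$, and hence $\alpha^*(cw) = c\alpha^*(w)$. Thus $\alpha^*$ is a linear transformation and,
since $y_w$ is uniquely defined, it is also unique.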


Let $V$ and $W$ be inner product spaces and let $\alpha : V \to W$ be a linear
transformation. A linear transformation $\alpha^* : W \to V$ satisfying the condition
$\langle \alpha(v), w\rangle = \langle v, \alpha^*(w)\rangle$ for all $v \in V$ and $w \in W$ is called an adjoint transformation of $\alpha$. If such
an adjoint exists, it must be unique. Indeed, assume that $\alpha : V \to W$ has adjoints $\alpha^*$
and $\alpha^{\star}$ and that there exists an element $w' \in W$ satisfying $\alpha^*(w') \ne \alpha^{\star}(w')$. Set
$v' = \alpha^*(w') - \alpha^{\star}(w')$. Then
$\langle v', v'\rangle = \langle v', \alpha^*(w')\rangle - \langle v', \alpha^{\star}(w')\rangle = \langle \alpha(v'), w'\rangle - \langle \alpha(v'), w'\rangle = 0$
and so $v' = 0_V$, which is a contradiction.

By Proposition 16.15, we know that if $V$ and $W$ are finitely generated then every
$\alpha \in \operatorname{Hom}(V, W)$ has an adjoint.


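Proposition 16.16 Let $V$ and $W$ be finitely-generated inner product spaces,
having orthonormal bases $B = \{v_1, \ldots, v_n\}$ and $D = \{w_1, \ldots, w_k\}$,
respectively. Let $\alpha : V \to W$ be a linear transformation. Then $\Phi_{BD}(\alpha)$ is the matrix
$A = [a_{ij}]$, where $a_{ij} = \langle \alpha(v_j), w_i\rangle$, and $\Phi_{DB}(\alpha^*) = A^H$.


Proof For all $1 \le i \le n$, let $\alpha(v_i) = \sum_{h=1}^{k} a_{hi} w_h$. Then for all $1 \le j \le k$ we have
$\langle \alpha(v_i), w_j\rangle = \bigl\langle \sum_{h=1}^{k} a_{hi} w_h, w_j \bigr\rangle = a_{ji}$ and also
$\langle \alpha^*(w_j), v_i\rangle = \overline{\langle v_i, \alpha^*(w_j)\rangle} = \overline{\langle \alpha(v_i), w_j\rangle} = \bar{a}_{ji}$, as needed.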
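
For the canonical bases of $\mathbb{C}^n$ and $\mathbb{C}^k$ with the dot product, Proposition 16.16 says that the adjoint is represented by the conjugate transpose. The following sketch checks the defining identity on one example; the matrix, the vectors and the function names are illustrative choices, and the inner product is taken linear in the first argument.

```python
def inner(v, w):
    """Inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(v, w))


def apply(matrix, v):
    """Matrix-vector product, with the matrix given as a list of rows."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in matrix]


def conjugate_transpose(matrix):
    """Conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(matrix), len(matrix[0])
    return [[matrix[i][j].conjugate() for i in range(rows)] for j in range(cols)]


if __name__ == "__main__":
    A = [[1 + 2j, 0, 3j], [2, 1 - 1j, -1]]   # represents a map from C^3 to C^2
    v = [1j, 2, 1 - 1j]
    w = [3, 1 + 1j]
    left = inner(apply(A, v), w)
    right = inner(v, apply(conjugate_transpose(A), w))
    print(left, right)
```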


Example It is, of course, possible that a linear transformation between inner product
spaces can have an adjoint even if the spaces are not finitely generated. For
example, let $[a, b]$ be a closed interval on the real line and let $V$ be the vector space of all
differentiable functions from $[a, b]$ to $\mathbb{R}$. Define an inner product on $V$ by setting
$\langle f, g\rangle = \int_a^b f(x)g(x)\,dx$. This is an inner product space which is not finitely
generated over $\mathbb{R}$. Let $\alpha$ be the endomorphism of $V$ satisfying $\alpha(f) : x \mapsto \int_a^b e^{tx} f(t)\,dt$.
Then $\langle \alpha(f), g\rangle = \langle f, \alpha(g)\rangle$ for all $f, g \in V$, and so $\alpha^*$ exists, and equals $\alpha$.


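Proposition 16.17 Let $V$, $W$, and $Y$ be inner product spaces. Let $\alpha$ and $\beta$
be linear transformations from $V$ to $W$ having adjoints, let $\zeta$ be a linear
transformation from $W$ to $Y$ having an adjoint, and let $c$ be a scalar. Then:
(1) $(\alpha + \beta)^* = \alpha^* + \beta^*$;
(2) $(c\alpha)^* = \bar{c}\alpha^*$;
(3) $(\zeta\alpha)^* = \alpha^*\zeta^*$;
(4) $\alpha^{**} = \alpha$.


Proof (1) For all $v \in V$ and all $w \in W$, we have
\[
\begin{aligned}
\langle v, (\alpha + \beta)^*(w)\rangle &= \langle (\alpha + \beta)(v), w\rangle = \langle \alpha(v) + \beta(v), w\rangle \\
&= \langle \alpha(v), w\rangle + \langle \beta(v), w\rangle = \langle v, \alpha^*(w)\rangle + \langle v, \beta^*(w)\rangle \\
&= \langle v, (\alpha^* + \beta^*)(w)\rangle,
\end{aligned}
\]
and so by the uniqueness of the adjoint we get $(\alpha + \beta)^* = \alpha^* + \beta^*$.
(2) For all $v \in V$ and all $w \in W$, we have
\[
\begin{aligned}
\langle v, (c\alpha)^*(w)\rangle &= \langle (c\alpha)(v), w\rangle = \langle c(\alpha(v)), w\rangle = c\langle \alpha(v), w\rangle \\
&= c\langle v, \alpha^*(w)\rangle = \langle v, \bar{c}\alpha^*(w)\rangle = \langle v, (\bar{c}\alpha^*)(w)\rangle,
\end{aligned}
\]
and so $(c\alpha)^* = \bar{c}\alpha^*$.
(3) For all $v \in V$ and all $y \in Y$, we have
$\langle v, (\zeta\alpha)^*(y)\rangle = \langle (\zeta\alpha)(v), y\rangle = \langle \alpha(v), \zeta^*(y)\rangle = \langle v, \alpha^*\zeta^*(y)\rangle$,
and so $(\zeta\alpha)^* = \alpha^*\zeta^*$.
(4) For all $v \in V$ and all $w \in W$, we have
$\langle w, \alpha^{**}(v)\rangle = \langle \alpha^*(w), v\rangle = \overline{\langle v, \alpha^*(w)\rangle} = \overline{\langle \alpha(v), w\rangle} = \langle w, \alpha(v)\rangle$,
and so $\alpha^{**} = \alpha$.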


If $(K, \bullet)$ is an algebra over a field $F$, then a function $a \mapsto a^*$ from $K$ to itself is
an involution of $K$ if and only if the following additional conditions are satisfied:
(1) $(a + b)^* = a^* + b^*$ and $(a \bullet b)^* = b^* \bullet a^*$ for all $a, b \in K$;
(2) $a^{**} = a$ for all $a \in K$.

Note that $0^* = (0 + 0)^* = 0^* + 0^*$ and so $0^* = 0$. This means that if $0 \ne a \in K$ then
$0 \ne a^*$, by (2). If $K$ is unital, then $1 = 1^{**} = (1 \bullet 1^*)^* = 1^{**} \bullet 1^* = 1 \bullet 1^* = 1^*$.
In this case, if $a$ is a unit of $K$ then $(a^{-1})^* a^* = (a a^{-1})^* = 1^* = 1$ and similarly
$a^*(a^{-1})^* = 1$, so $(a^{-1})^* = (a^*)^{-1}$.

An element $a$ of $K$ is symmetric with respect to $*$ if and only if $a^* = a$. If $b \in K$
is a unit symmetric with respect to $*$ then it is straightforward to verify that the
function $a \mapsto b^{-1}a^*b$ is also an involution of $K$.


Example If $V$ is a finitely-generated inner product space, then we see that the function
$\alpha \mapsto \alpha^*$ is an involution of the $F$-algebra $\operatorname{End}(V)$. Another involution we have
already seen is the function $A \mapsto A^T$ of the $F$-algebra $M_{n \times n}(F)$, for any field $F$.
Of course, in the case $F = \mathbb{R}$, the relation between these two involutions can be seen
from Proposition 16.16. We have also seen that the function $A \mapsto A^H$ is an involution
on $M_{n \times n}(\mathbb{C})$, and its relation to the involution $\alpha \mapsto \alpha^*$ is also immediate from
Proposition 16.15.


Example Let F be a field and let (K,e) be an F-algebra. Define an operation © 


on K by setting | © | = eel Then (K7,©) is an F-algebra and the 


: a b|. . : : 
function pl }q] is involution of this algebra. 


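Proposition 16.18 Let $\alpha : V \to W$ be a linear transformation between
finitely-generated inner product spaces. Then:

(1) $\ker(\alpha^*) = \operatorname{im}(\alpha)^\perp$;
(2) $\ker(\alpha) = \operatorname{im}(\alpha^*)^\perp$;
(3) $\operatorname{im}(\alpha) = \ker(\alpha^*)^\perp$;
(4) $\operatorname{im}(\alpha^*) = \ker(\alpha)^\perp$.


Proof (1) We note that
\[
\begin{aligned}
\ker(\alpha^*) &= \{w \in W \mid \alpha^*(w) = 0_V\} \\
&= \{w \in W \mid \langle v, \alpha^*(w)\rangle = 0 \text{ for all } v \in V\} \\
&= \{w \in W \mid \langle \alpha(v), w\rangle = 0 \text{ for all } v \in V\} = \operatorname{im}(\alpha)^\perp.
\end{aligned}
\]
(2) This follows from the same argument as (1), replacing $\alpha$ by $\alpha^*$.
(3) By (1) and Proposition 16.6, we have $\operatorname{im}(\alpha) = (\operatorname{im}(\alpha)^\perp)^\perp = \ker(\alpha^*)^\perp$.
(4) This follows from (2) in the way (3) follows from (1).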


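Proposition 16.19 If $\alpha$ is an endomorphism of a finitely-generated inner
product space $V$ then $\operatorname{null}(\alpha) = \operatorname{null}(\alpha^*)$.


Proof By Proposition 6.10 and Proposition 16.18, we see that
$\operatorname{null}(\alpha) = \dim(\operatorname{im}(\alpha^*)^\perp) = \dim(V) - \dim(\operatorname{im}(\alpha^*)) = \operatorname{null}(\alpha^*)$.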


Example Proposition 16.19 is not necessarily true for inner product spaces which
are not finitely generated. For example, let $V = \mathbb{R}^{(\infty)}$ with the inner product
$\langle [a_0, a_1, \ldots], [b_0, b_1, \ldots]\rangle = \sum_{i=0}^{\infty} a_i b_i$. Let $\alpha \in \operatorname{End}(V)$ be given by
$\alpha : [a_0, a_1, \ldots] \mapsto [0, a_0, a_1, \ldots]$. Then $\alpha^*$ exists and is given by
$\alpha^* : [a_0, a_1, \ldots] \mapsto [a_1, a_2, \ldots]$. Clearly, $\ker(\alpha)$ is trivial but $\ker(\alpha^*)$ is not.
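
The two shift maps of this example can be modelled on sequences with finitely many nonzero entries, stored as Python lists with the trailing zeros omitted. The sketch below is illustrative and the function names are hypothetical; it checks the adjoint identity on sample vectors and exhibits a nonzero vector killed by the left shift.

```python
def shift_right(seq):
    """The map [a0, a1, ...] -> [0, a0, a1, ...]."""
    return [0] + list(seq)


def shift_left(seq):
    """The candidate adjoint [a0, a1, a2, ...] -> [a1, a2, ...]."""
    return list(seq[1:])


def inner(x, y):
    """Sum of a_i * b_i; zip stops at the shorter list, whose omitted tail is zero."""
    return sum(a * b for a, b in zip(x, y))


if __name__ == "__main__":
    x = [1, 2, 3]
    y = [4, 5, 6, 7]
    print(inner(shift_right(x), y), inner(x, shift_left(y)))  # equal values
    print(shift_left([1]))  # the nonzero vector [1, 0, 0, ...] is sent to zero
```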


Proposition 16.20 Let $\alpha : V \to W$ be a linear transformation between
finitely-generated inner product spaces. Then

(1) If $\alpha$ is a monomorphism then $\alpha^*\alpha$ is an automorphism of $V$;
(2) If $\alpha$ is an epimorphism then $\alpha\alpha^*$ is an automorphism of $W$.


Proof (1) It suffices to prove that the linear transformation $\alpha^*\alpha$ is monic. And,
indeed, if $v \in V$ satisfies $\alpha^*\alpha(v) = 0_V$ then $\langle \alpha(v), \alpha(v)\rangle = \langle \alpha^*\alpha(v), v\rangle = \langle 0_V, v\rangle = 0$
and so $\alpha(v) = 0_W$. Since $\alpha$ is a monomorphism, $v = 0_V$ and so we have shown
that $\alpha^*\alpha$ is monic, as we needed.

(2) First of all, we will show that $\alpha^*$ is a monomorphism. Indeed, if $w_1, w_2 \in W$
are vectors satisfying $\alpha^*(w_1) = \alpha^*(w_2)$ then for all $v \in V$ we have
$\langle \alpha(v), w_1 - w_2\rangle = \langle v, \alpha^*(w_1) - \alpha^*(w_2)\rangle = 0$ and since $\alpha$ is an epimorphism, we
conclude that $\langle w, w_1 - w_2\rangle = 0$ for all $w \in W$. This implies that $w_1 - w_2 = 0_W$
and so $w_1 = w_2$, showing that $\alpha^*$ is indeed monic. Now we will show that $\alpha\alpha^*$
is also monic, which will suffice to prove (2). Indeed, if $\alpha\alpha^*(w) = 0_W$ then
$\langle \alpha^*(w), \alpha^*(w)\rangle = \langle \alpha\alpha^*(w), w\rangle = 0$ and so $\alpha^*(w) = 0_V$, proving that $w = 0_W$.
Proposition 16.21 Let $\alpha : V \to W$ be an isomorphism between
finitely-generated inner product spaces. Then $(\alpha^*)^{-1} = (\alpha^{-1})^*$.


Proof Let $\beta = (\alpha^{-1})^*$. Then for all $v_1, v_2 \in V$ we have
$\langle v_1, v_2\rangle = \langle \alpha^{-1}\alpha(v_1), v_2\rangle = \langle \alpha(v_1), \beta(v_2)\rangle = \langle v_1, \alpha^*\beta(v_2)\rangle$
and so $\alpha^*\beta(v_2) = v_2$ for all $v_2 \in V$, which means
that $\alpha^*\beta$ is the identity map on $V$. Thus $\beta = (\alpha^*)^{-1}$.


Finally, we mention a few consequences of some of the above results with which
we will not deal at length, but which have extensive and interesting discussions in the
mathematical literature.

(1) In an inner product space over $\mathbb{R}$, we can also project onto affine subsets and
not just onto subspaces. Indeed, if $V$ is an inner product space over $\mathbb{R}$ then any
element $v$ of $V$ defines a linear functional $\delta_v \in D(V)$ given by $\delta_v : w \mapsto \langle w, v\rangle$.
If $0 \ne c \in \mathbb{R}$ and $v \ne 0_V$, then $\delta_v^{-1}(c)$ is an affine subset of $V$. Define a function $\theta_v : V \to V$
by setting
\[
\theta_v : y \mapsto y + \left[\frac{c - \langle y, v\rangle}{\|v\|^2}\right] v.
\]
Then for all $y \in V$ we have $\delta_v\theta_v(y) = c$ and so we see that $\theta_v(y) \in \delta_v^{-1}(c)$, so
that $\operatorname{im}(\theta_v) \subseteq \delta_v^{-1}(c)$. Moreover, $\theta_v^2 = \theta_v$. We call the function $\theta_v$ the projection
on the affine set $\delta_v^{-1}(c)$. Such projections have many applications, such as the
algebraic reconstruction technique (ART), which is very important in
computerized imaging.

(2) From Proposition 16.5, we see that the rule which assigns to each subspace $W$ of
a finitely-generated inner product space $V$ the orthogonal projection of $V$ onto
$W$ is an embedding of the set of all subspaces of $V$ into the algebra $\operatorname{End}(V)$.
This observation has many ramifications, of which we mention but one. Let
$V$ be a finitely-generated inner product space. For subspaces $W$ and $Y$ of $V$,
we can define the gap between $W$ and $Y$ to be $g(W, Y) = \|\pi_W - \pi_Y\|$, where
$\pi_W$ and $\pi_Y$ are the orthogonal projections of $V$ onto $W$ and $Y$, respectively.
This allows us to measure the distance between subspaces of $V$ in a natural
way. One immediately sees that $g(W, Y) = g(W^\perp, Y^\perp)$ and that $g(W, Y) \le 1$
for all such $W$ and $Y$. Since the gap is a distance function, in the sense of
Proposition 15.12, it turns the set of all subspaces of $V$ into a metric space,
the topological properties of which can be studied. For example, one can show
that this space is compact and, as a result, also complete, meaning that every
sequence $W_1, W_2, \ldots$ of subspaces of $V$ satisfying $\lim_{i,j \to \infty} g(W_i, W_j) = 0$ is
convergent. It also makes sense to talk about continuous families of subspaces of
$V$. The analysis of the topological space of all subspaces of a finite-dimensional
inner product space has proven to be an extremely important tool.
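
The projection onto an affine set described in item (1) above is short enough to write out. The sketch below is a minimal illustration in $\mathbb{R}^n$ with the dot product. The function `project_affine` implements the map $y \mapsto y + \|v\|^{-2}(c - \langle y, v\rangle)v$ for a nonzero $v$. The function `art_sweep` applies such projections in turn to the hyperplanes of a linear system; this cyclic-projection scheme is often described as the Kaczmarz method and is offered here only as an assumed, simplified picture of how ART uses these projections. Both function names are hypothetical.

```python
def inner(x, y):
    """Dot product on R^n."""
    return sum(a * b for a, b in zip(x, y))


def project_affine(y, v, c):
    """Project y onto the affine set {x : <x, v> = c}; v must be nonzero."""
    scale = (c - inner(y, v)) / inner(v, v)
    return [yi + scale * vi for yi, vi in zip(y, v)]


def art_sweep(x, rows, rhs):
    """One sweep of cyclic projections onto the hyperplanes <x, rows[i]> = rhs[i]."""
    for v, c in zip(rows, rhs):
        x = project_affine(x, v, c)
    return x


if __name__ == "__main__":
    p = project_affine([1.0, 2.0, 3.0], [1.0, 1.0, 0.0], 5.0)
    print(p, inner(p, [1.0, 1.0, 0.0]))

    # A consistent 2 x 2 system whose solution is (1, 1).
    rows, rhs = [[2.0, 1.0], [1.0, 3.0]], [3.0, 4.0]
    x = [0.0, 0.0]
    for _ in range(100):
        x = art_sweep(x, rows, rhs)
    print(x)
```

Applying `project_affine` twice gives the same result as applying it once, which is the idempotence $\theta_v^2 = \theta_v$ noted in the text.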
Exercise 1006
Let A= F 4 € M>x2(R). Does there exist a matrix B € M>,2(R) of the 
cos(t) —sin(t) 
io sin(t) cos(t) 
considered as elements of the space R*, endowed with the dot product? 


such that the columns of A + B are orthogonal, when 


Exercise 1007 


3 
Calculate the angle between the vectors | 1 | and 1 | in the space R?, en- 
—2 
dowed with the dot product. 
Exercise 1008 
1 1 
1 -1 
Calculate the angle between the vectors | 1 | and 1 | in the space R°, en- 
2 -1 
1 1 


dowed with the dot product. 


Exercise 1009

Let $n$ be a positive integer. A matrix $A = [a_{hj}] \in M_{n \times n}(\mathbb{C})$ is a complex
Hadamard matrix if and only if $|a_{hj}| = 1$ for all $1 \le h, j \le n$ and any pair of
distinct rows of $A$, considered as vectors in $\mathbb{C}^n$, is orthogonal. For each $n$, find
a complex number $d$ such that $A = [d^{hj}]$ is a complex Hadamard matrix. For
$n = 6$, find a complex Hadamard matrix which is not of this form.


Exercise 1010

Let $A$ and $B$ be nonempty subsets of $\mathbb{R}^3$ which satisfy the condition that $u \times v \in B$
whenever $u \in A$ and $v \in B$. Is it true that $u \times w \in B^\perp$ whenever $u \in A$ and
$w \in B^\perp$?


Exercise 1011
Let $V = C(0, 1)$ on which we have defined the inner product
$\langle f, g\rangle = \int_0^1 f(x)g(x)\,dx$. Calculate $\|\cos(t)\|$.


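Exercise 1012
Let $V = M_{2 \times 2}(\mathbb{R})$ and define an inner product on $V$ by setting
\[
\left\langle \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix},
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \right\rangle
= \sum_{i=1}^{2}\sum_{j=1}^{2} a_{ij}b_{ij}.
\]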
Find the angle between the matrices | and ea | . 


Exercise 1013 
Let V be the space R*, together with the dot product. Find a normal vector in V 


1 1 2 
which is orthogonal to each of the vectors ii = , and ; 
1 1 3 


Exercise 1014 
Let V = R? on which some inner product is defined. Does there exist a vector 


2 1 2 
Oy 4v € V which is orthogonal to each of the vectors | 1 |, | —1 |, and] 0 |? 
3 1 4 


Exercise 1015 
Let f, g € R® be defined by f : xt x and g:xH x? 5. Are f and g orthog- 
onal as elements of C(0, 1)? Are they orthogonal as elements of C(0, 2)? 


Exercise 1016
Find a real number $c$ such that $\|v - w\| = c$ for every orthonormal pair $\{v, w\}$ of
vectors in $\mathbb{R}^n$, on which the dot product is defined.


Exercise 1017

Let $n$ be a positive integer and let $c_1, \ldots, c_n$ be a list of real numbers. Let
$\{v_1, \ldots, v_n\}$ be an orthonormal basis for $\mathbb{R}^n$, and let $d = \min\{c_1, \ldots, c_n\}$. For each
$1 \le i \le n$, set $d_i = \sqrt{c_i - d}$ and let $w_i = d_i v_i$. Let $B \in M_{n \times n}(\mathbb{R})$ be the matrix
the columns of which are $w_1, \ldots, w_n$ and let $A = BB^T + dI$. For $1 \le i \le n$, show
that $v_i$ is an eigenvector of $A$ associated with the eigenvalue $c_i$.


Exercise 1018
Let $V = C(0, 1)$ on which we have defined the inner product
$\langle f, g\rangle = \int_0^1 f(x)g(x)\,dx$, and let $W = \mathbb{R}\{e^x\} \subseteq V$. Find an infinite set of elements of $W^\perp$.


Exercise 1019 
Let V = R* on which some inner product is defined. Find distinct vectors 
v, w, y € V such that v L w and w L y, but not v L y. 


Exercise 1020
Let $V$ be an inner product space over $\mathbb{R}$ and let $v$ and $w$ be vectors in $V$. Show
that $\|v\| = \|w\|$ if and only if $(v + w) \perp (v - w)$.
Exercise 1021

Let $V = C(-1, 1)$ and define an inner product on $V$ by setting
$\langle f, g\rangle = \int_{-1}^{1} f(x)g(x)\,dx$. Let $W$ be the subspace of $V$ composed of all even functions.
Find $W^\perp$.


Exercise 1022 

Let n be a positive integer and let V = M) x,(C), on which we have an inner 
product defined by (A, B) = tr(A? B). Let W be the subspace of V consisting of 
all those matrices A € V satisfying tr(A) = 0. Find W+. 


Exercise 1023 


Define an inner product on R* with respect to which the vectors |” | and 4 | 


are orthogonal. 


Exercise 1024 
Make use of the Gram—Schmidt process to find an orthonormal basis for the 
space RR? together with the dot product, beginning with the initial basis 


1 1 1 
Dhee | 2 a 2 
1 1 3 


Exercise 1025

Let $V$ be an inner product space and let $A$ be an orthonormal subset of $V$. Show
that $A$ is a maximal orthonormal subset if and only if for every $0_V \ne y \in V$ there
exists a $v \in A$ satisfying $\langle v, y\rangle \ne 0$.


Exercise 1026

Let $V$ be an inner product space of finite dimension $n$ over its field of scalars.
Show that there exists a subset $\{v_1, \ldots, v_{2n}\}$ of $V$ satisfying the conditions that
$\langle v_i, v_j\rangle \le 0$ for all $1 \le i \ne j \le 2n$.


Exercise 1027 

Let V be the space of all polynomial functions in R® of degree less than 3, with 
inner product (p,q) = aie p(t)q(t) dt. Find an orthonormal basis { po, 1, p2} 
of V satisfying deg(p;,) =h for h = 0, 1, 2. 


Exercise 1028
Consider the function $\mu : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ given by
\[
\mu : \left( \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \begin{bmatrix} a' \\ b' \\ c' \end{bmatrix} \right)
\mapsto 2aa' + ac' + ca' + bb' + cc'.
\]
Show that $\mu$ is an inner product and find a basis of $\mathbb{R}^3$ orthonormal with respect
to $\mu$.


Exercise 1029

Let $V$ be an inner product space over $\mathbb{R}$ and let $W$ be a finitely-generated
subspace of $V$ with orthonormal basis $\{w_1, \ldots, w_n\}$. Let $\alpha \in \operatorname{Hom}(V, W)$ be defined
by $\alpha : v \mapsto \sum_{i=1}^{n} \langle v, w_i\rangle w_i$. Show that $\alpha(v) - v \in W^\perp$ for all $v \in V$ and that
$\|\alpha(v) - v\| < \|w - v\|$ for all $\alpha(v) \ne w \in W$.


Exercise 1030 
Let W be the subspace of R* spanned by linearly-independent subset 


ano 


which is an inner product space with respect to the dot product. Make use of the 
Gram-—Schmidt process to find an orthonormal basis for W. 


Exercise 1031

Let $n$ be a positive integer and let $A \in M_{n \times n}(\mathbb{R})$. If the set of rows of $A$ is
orthonormal with respect to the dot product, is the same true for the set of columns
of $A$?


Exercise 1032

Let $n > k$ be positive integers and let $A \in M_{k \times n}(\mathbb{R})$ satisfy the condition
that its set of rows is orthonormal with respect to the dot product. Show that
$(A^T A)^2 = A^T A$.


Exercise 1033

Let $n$ be a positive integer and let $A \in M_{n \times n}(\mathbb{R})$. Show that $A$ is symmetric if
and only if for some $k \le n$ there exists a matrix $B \in M_{n \times k}(\mathbb{R})$ and a real number
$r$ such that $A = BB^T + rI$ and the columns of $B$ are mutually orthogonal.


Exercise 1034 


Let W be the subspace of R*, which is an inner product space with respect to the 
2 0 4 1 


dot product, generated by ; i : ; : , . Find an orthonormal 
0 2 4 1 


basis for W and an orthonormal basis for $W^{\perp}$.


Exercise 1035
Consider R* as an inner product space with respect to the dot product. Add two 


1 1 
vectors to the set 4 ¢ 31°21 4 in order to get an orthonormal basis for 
=5 1 


this space. 


Exercise 1036 

Let n be a positive integer and let $V = \mathbb{R}^n$, which is an inner product space with
respect to the dot product. Let $\{v_1, \ldots, v_n\}$ be an orthonormal basis for V, let
$a \in \mathbb{R}$, and let $1 \le h \neq k \le n$. Define vectors $w_1, \ldots, w_n$ in V by setting

$$w_i = \begin{cases} \cos(a) v_h - \sin(a) v_k & \text{if } i = h, \\ \sin(a) v_h - \cos(a) v_k & \text{if } i = k, \\ v_i & \text{otherwise.} \end{cases}$$

Is $\{w_1, \ldots, w_n\}$ an orthonormal basis for V?


Exercise 1037 
Consider R? as an inner product space with respect to the dot product. Is there 


4k 0 —25 
ak €Z such that 4),,k], 16k is an orthonormal basis for this 
—k 4 —12k 


space? 


Exercise 1038 
Consider R* as an inner product space with respect to the dot product. Find an 


1 1 0) 
orthonormal basis for R 0 P ! , : 
1 2 1 
0) 1 2 


Exercise 1039 
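Define an inner product on $\mathbb{R}^2$ by setting

$$\left\langle \begin{bmatrix} a \\ b \end{bmatrix}, \begin{bmatrix} c \\ d \end{bmatrix} \right\rangle = ac + \tfrac{1}{2}(ad + bc) + bd.$$

Find an orthonormal basis for this space.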


Exercise 1040 
Consider $\mathbb{R}^3$ as an inner product space with respect to the dot product. Let
a, b, c, d be nonzero real numbers satisfying the conditions that $a^2 + b^2 + c^2 = d^2$
a Cc
and ab +. ac = bc. Show that the subset 4 | b ! Cc u a of R?3 
Cc 


is orthonormal. 


Exercise 1041 
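Define an inner product on $\mathbb{R}^3$ by setting

$$\left\langle \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \right\rangle = a_1 b_1 + 2(a_2 b_2 + a_3 b_3) - (a_1 b_2 + a_2 b_1) - (a_2 b_3 + a_3 b_2).$$

Find an orthonormal basis for this space.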


Exercise 1042 
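Let m be a positive integer and let

$$W = \{ f \in \mathbb{C}^{\mathbb{Z}} \mid f(i + m) = f(i) \text{ for all } i \in \mathbb{Z} \},$$

which is a subspace of the vector space $\mathbb{C}^{\mathbb{Z}}$ over $\mathbb{C}$. Define a function
$\mu : W \times W \to \mathbb{C}$ by setting $\mu : (f, g) \mapsto \sum_{h=0}^{m-1} f(h)\overline{g(h)}$. For each $0 \le j < m$,
let $f_j \in W$ be the function defined by

$$f_j : h \mapsto \begin{cases} 1 & \text{if } h \text{ is of the form } j + mi \text{ for } i \in \mathbb{Z}, \\ 0 & \text{otherwise.} \end{cases}$$

Show that $\mu$ is an inner product on W and that, with respect to that product,
$\{f_0, \ldots, f_{m-1}\}$ is an orthonormal basis for W.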


Exercise 1043 

A function $f \in \mathbb{R}^{\mathbb{R}}$ has bounded support if and only if there exist real numbers
$a < b$ such that $f(x) = 0$ for all x not in the interval $[a, b]$ on the real line. Let V
be the set of all such functions and define a function $\mu : V \times V \to \mathbb{R}$ by setting
$\mu : (f, g) \mapsto \int_{-\infty}^{\infty} f(x) g(x)\,dx$. For each $k \in \mathbb{Z}$, let $f_k \in V$ be the function defined
by

$$f_k : x \mapsto \begin{cases} 1 & \text{if } k < x < k + 1, \\ 0 & \text{otherwise.} \end{cases}$$

Show that $\mu$ is an inner product on V and that the subset $\{ f_k \mid k \in \mathbb{Z} \}$ of V is
orthonormal with respect to this inner product.


Exercise 1044 

Let V be the subspace of $\mathbb{R}^{\mathbb{R}}$ consisting of all infinitely-differentiable functions
f which are periodic of period h > 0. (In other words, f(x +h) = f(x) for all 
x € R.) Define an inner product on V by setting (f, g) = ce FS (x) g(x) dx. Let 
$\alpha$ be the endomorphism of V which assigns to every element of V its derivative.
Find $\alpha^*$.


Exercise 1045

Let V be an inner product space over $\mathbb{R}$ and let $\{v_1, \ldots, v_n\}$ be a set of mutually-orthogonal
nonzero vectors in V. Let $a_1, \ldots, a_n$ be positive real numbers satisfying
$\sum_{i=1}^{n} a_i = 1$, and let $w = \sum_{i=1}^{n} a_i v_i$. Suppose that $w \perp v_i - v_j$ for all
$1 \le i \neq j \le n$. Then show that $\|w\|^{-2} = \sum_{i=1}^{n} \|v_i\|^{-2}$.


Exercise 1046 
Let V be a finitely-generated inner product space and let $\alpha, \beta_1, \beta_2 \in \operatorname{End}(V)$
satisfy $\alpha^*\alpha\beta_1 = \alpha^*\alpha\beta_2$. Show that $\alpha\beta_1 = \alpha\beta_2$.


Exercise 1047 
If W and Y are subspaces of a finitely-generated inner product space V, show 
that g(W, Y) is the maximum of 


sup{d(w, Y) | w € W and ||w|| = 1} and sup{d(y, W)| y €Y and ||y|| = 1}. 


Exercise 1048 
Let W and Y be subspaces of a finite-dimensional inner product space V satisfy- 
ing g(W, Y) < 1. Show that dim(W) = dim(Y). 


Exercise 1049 

Let K be an algebra over a field F on which we have an involution $a \mapsto a^*$
defined. Let c be an element of K satisfying the conditions that $c = c^2$ and that
$c + c^* - 1$ is a unit of K. Show that there exists an element $d \in K$ satisfying
$d^2 = d = d^*$, $dc = c$, and $cd = d$. Is d necessarily unique?


Exercise 1050 
Let V be an inner product space over $\mathbb{C}$ and let $\alpha$ be an endomorphism of V
satisfying $\alpha^* = -\alpha$. Show that every eigenvalue of $\alpha$ is purely imaginary.


17 Selfadjoint Endomorphisms


Let V be an inner product space. An endomorphism $\alpha$ of V is selfadjoint if and
only if $\langle \alpha(v), w \rangle = \langle v, \alpha(w) \rangle$ for all $v, w \in V$. Such endomorphisms always exist
since $\sigma_c$ is selfadjoint for any $c \in \mathbb{R}$. Selfadjoint endomorphisms have important
applications in mathematical models in physics. For example, in mathematical models
of quantum theory, selfadjoint operators on the state space of a system represent
measurements which can be performed on the system. Note that if $\alpha \in \operatorname{End}(V)$
is selfadjoint, then $\langle \alpha(v), v \rangle = \langle v, \alpha(v) \rangle = \overline{\langle \alpha(v), v \rangle}$ and so $\langle \alpha(v), v \rangle \in \mathbb{R}$ for all
$v \in V$.


Example Let $V = C(0, 1)$, which is an inner product space over $\mathbb{R}$ in which
$\langle f, g \rangle = \int_0^1 f(x) g(x)\,dx$. Then the endomorphism $\alpha$ of V defined by
$\alpha(f) : x \mapsto \int_0^1 \cos(x - y) f(y)\,dy$ for all $f \in V$ is selfadjoint.
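
One way to check the claim in this example is sketched below. It assumes that both integrals run over the interval $[0, 1]$ and that the order of integration may be exchanged, which is justified here because all of the functions involved are continuous. The only property of the kernel that is used is its symmetry, $\cos(x - y) = \cos(y - x)$:

$$\langle \alpha(f), g \rangle = \int_0^1 \left( \int_0^1 \cos(x - y) f(y)\,dy \right) g(x)\,dx = \int_0^1 f(y) \left( \int_0^1 \cos(y - x) g(x)\,dx \right) dy = \langle f, \alpha(g) \rangle.$$

The same computation would go through for any continuous kernel $k(x, y)$ satisfying $k(x, y) = k(y, x)$.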


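Proposition 17.1 Let V be an inner product space. Then:

(1) If $\alpha \in \operatorname{End}(V)$ has an adjoint $\alpha^*$, then $\alpha + \alpha^*$ is selfadjoint;

(2) If $\alpha \in \operatorname{End}(V)$ is selfadjoint, so is $c\alpha$ for each $c \in \mathbb{R}$;

(3) If $\alpha \in \operatorname{End}(V)$ is selfadjoint, so is $\alpha^n$ for each positive integer n;

(4) If $\alpha, \beta \in \operatorname{End}(V)$ are selfadjoint, so are $\alpha + \beta$ and $\alpha \bullet \beta$, where $\bullet$ is the
Jordan product in End(V);

(5) If $\alpha \in \operatorname{End}(V)$ is selfadjoint and $\beta \in \operatorname{End}(V)$ has an adjoint, then $\beta\alpha\beta^*$
is selfadjoint.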


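Proof (1) If $v, w \in V$ then

$$\begin{aligned} \langle (\alpha + \alpha^*)(v), w \rangle &= \langle \alpha(v), w \rangle + \langle \alpha^*(v), w \rangle \\ &= \langle v, \alpha^*(w) \rangle + \langle v, \alpha(w) \rangle = \langle v, (\alpha + \alpha^*)(w) \rangle. \end{aligned}$$

(2) If $v, w \in V$ then $\langle (c\alpha)(v), w \rangle = c\langle \alpha(v), w \rangle = c\langle v, \alpha(w) \rangle = \langle v, (c\alpha)(w) \rangle$.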


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 395 
DOI 10.1007/978-94-007-2636-9_17, © Springer Science+Business Media B.V. 2012 




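(3) This follows by an easy induction argument, using Proposition 16.17(3).
(4) The selfadjointness of $\alpha + \beta$ is an immediate consequence of Proposition 16.17(1).
Also, recall that $\alpha \bullet \beta = \frac{1}{2}(\alpha\beta + \beta\alpha)$ and so, if $v, w \in V$ then

$$\begin{aligned} \langle (\alpha \bullet \beta)(v), w \rangle &= \tfrac{1}{2}\langle \alpha\beta(v), w \rangle + \tfrac{1}{2}\langle \beta\alpha(v), w \rangle \\ &= \tfrac{1}{2}\langle v, \beta\alpha(w) \rangle + \tfrac{1}{2}\langle v, \alpha\beta(w) \rangle = \langle v, (\alpha \bullet \beta)(w) \rangle. \end{aligned}$$

(5) By Proposition 16.17, we have $(\beta\alpha\beta^*)^* = \beta^{**}\alpha^*\beta^* = \beta\alpha\beta^*$.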


In particular, if $\alpha$ is a selfadjoint endomorphism of an inner product space V, and
if $p(X) \in \mathbb{R}[X]$, then $p(\alpha)$ is selfadjoint. The product of selfadjoint endomorphisms
of V need not be selfadjoint, as we will see in the example after Proposition 17.4.
Thus we see that the set of all selfadjoint endomorphisms of V is a subspace, though
not necessarily a subalgebra, of End(V).


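Example Let V be an inner product space over $\mathbb{R}$ and let $\alpha \in \operatorname{End}(V)$ be selfadjoint.
Let $a, b \in \mathbb{R}$ satisfy the condition that $a^2 < 4b$. Then, by the previous remark, we
know that $\beta = \alpha^2 + a\alpha + b\sigma_1$ is again a selfadjoint endomorphism of V. Moreover,
if $0_V \neq v \in V$ then

$$\begin{aligned} \langle \beta(v), v \rangle &= \langle \alpha^2(v), v \rangle + a\langle \alpha(v), v \rangle + b\langle v, v \rangle \\ &= \langle \alpha(v), \alpha(v) \rangle + a\langle \alpha(v), v \rangle + b\langle v, v \rangle \\ &= \|\alpha(v)\|^2 + a\langle \alpha(v), v \rangle + b\|v\|^2. \end{aligned}$$

By Proposition 15.2, we know that $|\langle \alpha(v), v \rangle| \le \|\alpha(v)\| \cdot \|v\|$ and so

$$\begin{aligned} \langle \beta(v), v \rangle &\ge \|\alpha(v)\|^2 - |a| \cdot \|\alpha(v)\| \cdot \|v\| + b\|v\|^2 \\ &= \left( \|\alpha(v)\| - \tfrac{1}{2}|a|\,\|v\| \right)^2 + \left( b - \tfrac{1}{4}a^2 \right)\|v\|^2 > 0. \end{aligned}$$

Thus $\beta(v) \neq 0_V$ for each $0_V \neq v \in V$, showing that $\beta$ is monic. In particular, if V
is finitely generated, this in fact shows that $\beta$ is an automorphism of V.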


Let V be a finitely-generated inner product space having an orthonormal basis D.
If $\alpha \in \operatorname{End}(V)$ and if $\Phi_{DD}(\alpha) = [a_{ij}]$, then we know from Proposition 16.16 that
$\Phi_{DD}(\alpha^*) = \Phi_{DD}(\alpha)^H = [a_{ij}]^H$. Therefore, if $\alpha$ is selfadjoint we have $a_{ij} = \overline{a_{ji}}$
for all $1 \le i, j \le n$. In particular, $a_{ii} = \overline{a_{ii}}$ for all $1 \le i \le n$ and so the diagonal
entries in $\Phi_{DD}(\alpha)$ belong to $\mathbb{R}$. Matrices A over $\mathbb{C}$ satisfying the condition that
$A = A^H$ are known as Hermitian matrices. When we are working over $\mathbb{R}$, these are,
of course, just the symmetric matrices. It is clear that the sum of Hermitian matrices
is again a Hermitian matrix, but the product of Hermitian matrices is not necessarily
Hermitian, just as we have seen that the product of symmetric matrices is not necessarily
symmetric. We do note, however, that if a matrix $A \in M_{n \times n}(\mathbb{C})$ is Hermitian
then so is $A^n$. Indeed, if A and B are Hermitian matrices in $M_{n \times n}(\mathbb{C})$, then their
Jordan product $\frac{1}{2}(AB + BA)$ is a Hermitian matrix and so, in particular, the product
of a commuting pair of Hermitian matrices is again Hermitian. Moreover, any
matrix $D \in M_{n \times n}(\mathbb{C})$ can be written in the form $A + iB$, where $A = \frac{1}{2}(D + D^H)$
and $B = -\frac{i}{2}(D - D^H)$ are both Hermitian matrices. If $A \in M_{n \times n}(\mathbb{C})$ is Hermitian,
then so is cA for any $c \in \mathbb{R}$, and so the set of all Hermitian matrices in $M_{n \times n}(\mathbb{C})$
is a subspace of $M_{n \times n}(\mathbb{C})$, considered as a vector space over $\mathbb{R}$; indeed, it is a
subalgebra of the commutative Jordan $\mathbb{R}$-algebra $M_{n \times n}(\mathbb{C})^+$. However, this set is
not closed under multiplication by complex scalars, and so it is not a vector space
over $\mathbb{C}$.
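
The splitting of an arbitrary complex matrix D into Hermitian parts, $D = A + iB$ with $A = \frac{1}{2}(D + D^H)$ and $B = -\frac{i}{2}(D - D^H)$, can be checked numerically. The following is only a small sketch: matrices are stored as lists of rows of Python complex numbers, and the function names `conj_transpose`, `hermitian_parts` and `is_hermitian` are illustrative choices, not notation from the text.

```python
def conj_transpose(M):
    """Return the conjugate transpose M^H of a square matrix M."""
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]


def hermitian_parts(D):
    """Return (A, B) with A = (D + D^H)/2 and B = -(i/2)(D - D^H)."""
    n = len(D)
    DH = conj_transpose(D)
    A = [[(D[i][j] + DH[i][j]) / 2 for j in range(n)] for i in range(n)]
    B = [[-0.5j * (D[i][j] - DH[i][j]) for j in range(n)] for i in range(n)]
    return A, B


def is_hermitian(M, tol=1e-12):
    """Check M = M^H entry by entry, up to a small tolerance."""
    MH = conj_transpose(M)
    n = len(M)
    return all(abs(M[i][j] - MH[i][j]) <= tol for i in range(n) for j in range(n))


# An arbitrary non-Hermitian test matrix (made up for this illustration).
D = [[1 + 2j, 3 - 1j], [0.5j, -4 + 0j]]
A, B = hermitian_parts(D)
# Recombine A + iB; this should reproduce D.
recombined = [[A[i][j] + 1j * B[i][j] for j in range(2)] for i in range(2)]
```

Here A and B each pass the `is_hermitian` check while D itself does not, and `recombined` agrees with D entry by entry.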

We have already seen that if V is an inner product space and if $\alpha \in \operatorname{End}(V)$ is
selfadjoint, then $\langle \alpha(v), v \rangle \in \mathbb{R}$ for all $v \in V$. If V is finitely generated, the reverse is
also true, as follows immediately from the following result.


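Proposition 17.2 Let V be an inner product space over $\mathbb{C}$ and let $\alpha \in \operatorname{End}(V)$
have an adjoint. If $\langle \alpha(v), v \rangle \in \mathbb{R}$ for all $v \in V$, then $\alpha$ is selfadjoint.


Proof For vectors $v, w \in V$, we have

$$\langle \alpha(v + w), v + w \rangle = \langle \alpha(v), v \rangle + \langle \alpha(v), w \rangle + \langle \alpha(w), v \rangle + \langle \alpha(w), w \rangle$$

and since, by assumption, we know that the scalars $\langle \alpha(v + w), v + w \rangle$, $\langle \alpha(v), v \rangle$,
and $\langle \alpha(w), w \rangle$ are all real, we see that $\langle \alpha(v), w \rangle + \langle \alpha(w), v \rangle \in \mathbb{R}$ as well. A real
number equals its own complex conjugate, so this implies that
$\langle \alpha(v), w \rangle + \langle \alpha(w), v \rangle = \langle w, \alpha(v) \rangle + \langle v, \alpha(w) \rangle$. Similarly,

$$\langle \alpha(v + iw), v + iw \rangle = \langle \alpha(v), v \rangle - i\langle \alpha(v), w \rangle + i\langle \alpha(w), v \rangle + \langle \alpha(w), w \rangle$$

and so $-i\langle \alpha(v), w \rangle + i\langle \alpha(w), v \rangle \in \mathbb{R}$. This implies that

$$-i\langle \alpha(v), w \rangle + i\langle \alpha(w), v \rangle = i\langle w, \alpha(v) \rangle - i\langle v, \alpha(w) \rangle$$

and so, multiplying this equality by i and adding the result to the equality
$\langle \alpha(v), w \rangle + \langle \alpha(w), v \rangle = \langle w, \alpha(v) \rangle + \langle v, \alpha(w) \rangle$ obtained above, we get
$2\langle \alpha(v), w \rangle = 2\langle v, \alpha(w) \rangle$, whence $\langle \alpha(v), w \rangle = \langle v, \alpha(w) \rangle$. Therefore, $\alpha = \alpha^*$.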


Proposition 17.3 Let V be an inner product space and let $\sigma_0 \neq \alpha \in \operatorname{End}(V)$
be selfadjoint. Then there exists a vector $v \in V$ satisfying $\langle \alpha(v), v \rangle \neq 0$.


Proof First, assume that the field of scalars is $\mathbb{C}$. Then it is easy to check that if
$v, w \in V$ then

$$\begin{aligned} \langle \alpha(v), w \rangle = {} & \tfrac{1}{4}\bigl[ \langle \alpha(v + w), v + w \rangle - \langle \alpha(v - w), v - w \rangle \bigr] \\ & + \tfrac{i}{4}\bigl[ \langle \alpha(v + iw), v + iw \rangle - \langle \alpha(v - iw), v - iw \rangle \bigr]. \end{aligned}$$

Moreover, each term on the right-hand side of this equality is of the form $\langle \alpha(y), y \rangle$
for some $y \in V$, so if all of these were equal to 0 we would see that $\alpha(v) \perp w$ for all
$v, w \in V$, which means that $\alpha(v) = 0_V$ for all $v \in V$, contradicting the hypothesis
that $\sigma_0 \neq \alpha$. Thus the desired result must hold.

Now assume that the field of scalars is $\mathbb{R}$. Then for all $v, w \in V$ we have
$\langle \alpha(w), v \rangle = \langle w, \alpha(v) \rangle = \langle \alpha(v), w \rangle$ and so

$$\langle \alpha(v), w \rangle = \tfrac{1}{4}\bigl[ \langle \alpha(v + w), v + w \rangle - \langle \alpha(v - w), v - w \rangle \bigr].$$

Again, each term on the right-hand side of this equality is of the form $\langle \alpha(y), y \rangle$ for
some $y \in V$, so if all of these were equal to 0 we would have $\alpha(v) \perp w$ for all
$v, w \in V$ which, as we have seen in the previous case, leads to a contradiction.
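
The complex identity used in this proof expresses $\langle \alpha(v), w \rangle$ through four values of the form $\langle \alpha(y), y \rangle$, namely a quarter of the difference for $y = v \pm w$ plus $\frac{i}{4}$ times the difference for $y = v \pm iw$. It can be spot-checked numerically. The sketch below assumes the standard inner product on $\mathbb{C}^n$, linear in the first argument and conjugate-linear in the second; the matrix and vectors are made-up test data and the function names are illustrative.

```python
def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def apply(M, x):
    """Matrix-vector product M x for a matrix stored as a list of rows."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]


def q(M, y):
    """The scalar <M y, y>."""
    return inner(apply(M, y), y)


def polarization(M, v, w):
    """Right-hand side of the complex polarization identity for <M v, w>."""
    shift = lambda c: [a + c * b for a, b in zip(v, w)]  # the vector v + c*w
    first = q(M, shift(1)) - q(M, shift(-1))
    second = q(M, shift(1j)) - q(M, shift(-1j))
    return 0.25 * first + 0.25j * second


M = [[2, 1 - 1j], [1 + 1j, 3]]   # a Hermitian test matrix
v = [1 + 1j, 2]
w = [0.5j, -1]
lhs = inner(apply(M, v), w)
rhs = polarization(M, v, w)
```

The two values `lhs` and `rhs` agree up to rounding. The identity itself does not use selfadjointness, so the same check would pass for a non-Hermitian matrix M as well.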


We now return to a new variant of a question we have already posed: If $\alpha$ is an
endomorphism of an inner product space V, when does there exist an orthonormal
basis of V composed of eigenvectors of $\alpha$? If such a basis exists, we say that $\alpha$ is
orthogonally diagonalizable.


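Example Let $V = \mathbb{R}^3$ with the dot product, and let $\alpha \in \operatorname{End}(V)$ be given by

$$\alpha : \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} b \\ a \\ c \end{bmatrix}.$$

Then $\left\{ \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}$ is an orthonormal basis of
V composed of eigenvectors of $\alpha$, so $\alpha$ is orthogonally diagonalizable.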


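Proposition 17.4 Let V be an inner product space and let $\alpha \in \operatorname{End}(V)$ be
selfadjoint. Then $\operatorname{spec}(\alpha) \subseteq \mathbb{R}$ and eigenvectors of $\alpha$ associated with distinct
eigenvalues are orthogonal.


Proof Let c be an eigenvalue of $\alpha$ and let v be an eigenvector of $\alpha$ associated
with c. Then $c\langle v, v \rangle = \langle cv, v \rangle = \langle \alpha(v), v \rangle = \langle v, \alpha(v) \rangle = \langle v, cv \rangle = \overline{c}\langle v, v \rangle$ and so,
since $v \neq 0_V$, we see that $c = \overline{c}$, proving that $c \in \mathbb{R}$. Thus we have shown the first
assertion.

If c and d are distinct eigenvalues of $\alpha$ associated with eigenvectors v and w,
respectively, then $c\langle v, w \rangle = \langle cv, w \rangle = \langle \alpha(v), w \rangle = \langle v, \alpha(w) \rangle = \langle v, dw \rangle = d\langle v, w \rangle$,
where the last step uses the fact that d is real. Since $c \neq d$, this implies that $\langle v, w \rangle = 0$.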
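

Example The endomorphisms $\alpha$ and $\beta$ of $\mathbb{C}^2$ which are given by $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} b \\ a \end{bmatrix}$
and $\beta : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a \\ -b \end{bmatrix}$ can easily be seen to be selfadjoint. However, $\operatorname{spec}(\beta\alpha) =
\{i, -i\}$ and so, by Proposition 17.4, $\beta\alpha$ is not selfadjoint.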


We note the following consequence of Proposition 17.4: If the matrix
$A \in M_{n \times n}(\mathbb{R})$ is symmetric, then all eigenvalues of A are real, and so the
characteristic polynomial of A is completely reducible in $\mathbb{R}[X]$.

By Proposition 17.4, we see that if V is an inner-product space of finite dimension
n over $\mathbb{C}$ and if $\alpha \in \operatorname{End}(V)$ is selfadjoint, then the eigenvalues of $\alpha$ can be
written uniquely as an n-tuple $(c_1(\alpha), \ldots, c_n(\alpha))$ of real numbers, the entries of
which form a nonincreasing sequence. If $\alpha, \beta \in \operatorname{End}(V)$ are selfadjoint, the problem
of describing the possible sets of eigenvalues of $\alpha + \beta$ in terms of those of $\alpha$ and
$\beta$ is extremely important in particle physics, and is known as Weyl’s Problem. Weyl
himself showed that if $\alpha, \beta \in \operatorname{End}(V)$ are selfadjoint, if $1 \le k \le i \le n$, and if
$1 \le j \le n - i + 1$, then

$$c_{i+j-1}(\alpha) + c_{n-j+1}(\beta) \le c_i(\alpha + \beta) \le c_{i-k+1}(\alpha) + c_k(\beta)$$

and so, if $j = k = 1$, we have $c_i(\alpha) + c_n(\beta) \le c_i(\alpha + \beta) \le c_i(\alpha) + c_1(\beta)$ for each
$1 \le i \le n$. In particular, $c_1(\alpha + \beta) \le c_1(\alpha) + c_1(\beta)$ and $c_n(\alpha) + c_n(\beta) \le c_n(\alpha + \beta)$.
Since then, this problem has been extensively studied. A solution in terms of
probability measures was given by the Australian mathematicians Anthony H. Dooley
and Norman J. Wildberger, together with the Canadian mathematician Joe Repka, in 1993.
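
The special case $j = k = 1$ of Weyl's inequalities, $c_i(\alpha) + c_n(\beta) \le c_i(\alpha + \beta) \le c_i(\alpha) + c_1(\beta)$, is easy to test on real symmetric $2 \times 2$ matrices, whose eigenvalues have a closed form. The sketch below is illustrative only: a symmetric matrix $\begin{bmatrix} a & b \\ b & d \end{bmatrix}$ is encoded as the triple `(a, b, d)`, which is a convention made up for this example, and the function names are hypothetical.

```python
import math


def eigs_sym2(a, b, d):
    """Eigenvalues (largest first) of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return (mean + radius, mean - radius)


def weyl_bounds_hold(A, B, tol=1e-12):
    """Check c_i(A) + c_2(B) <= c_i(A + B) <= c_i(A) + c_1(B) for i = 1, 2."""
    cA = eigs_sym2(*A)
    cB = eigs_sym2(*B)
    cS = eigs_sym2(*(x + y for x, y in zip(A, B)))   # eigenvalues of A + B
    return all(
        cA[i] + cB[1] - tol <= cS[i] <= cA[i] + cB[0] + tol
        for i in range(2)
    )


A = (2.0, 1.0, 3.0)    # encodes [[2, 1], [1, 3]]
B = (0.0, -1.0, 1.0)   # encodes [[0, -1], [-1, 1]]
ok = weyl_bounds_hold(A, B)
```

The small tolerance only absorbs floating-point rounding; the inequalities themselves are exact.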

Weyl’s result has many interesting consequences. We note that if V is an inner
product space and if $\alpha \in \operatorname{End}(V)$ is selfadjoint and represented with respect to a
given basis by some matrix $A \in M_{n \times n}(\mathbb{C})$ then the diagonal elements of A must
also be real. Schur proved that if A is a matrix representing such an endomorphism
$\alpha$ having eigenvalues $c_1(\alpha) \ge \cdots \ge c_n(\alpha)$ and diagonal entries $p_1 \ge \cdots \ge p_n$ then
$\sum_{j=1}^{k} p_j \le \sum_{j=1}^{k} c_j(\alpha)$ for all $1 \le k \le n - 1$ and $\sum_{j=1}^{n} p_j = \sum_{j=1}^{n} c_j(\alpha)$. The
converse was proven by the American mathematician Alfred Horn: if $c_1 \ge \cdots \ge c_n$
and $p_1 \ge \cdots \ge p_n$ are sequences of real numbers satisfying $\sum_{j=1}^{k} p_j \le \sum_{j=1}^{k} c_j$
for all $1 \le k \le n - 1$ and $\sum_{j=1}^{n} p_j = \sum_{j=1}^{n} c_j$ then there exists a selfadjoint
endomorphism $\alpha$ of $\mathbb{C}^n$ with eigenvalues $c_1, \ldots, c_n$ and having an orthonormal
basis relative to which $\alpha$ is represented by a matrix having diagonal entries
$p_1, \ldots, p_n$.


We now turn to the problem of finding the eigenvalues of a selfadjoint endomorphism
of a finitely-generated inner product space. This problem arises in many
important applications. For example, let $\Gamma$ be a (nondirected) graph with vertex
set $\{1, \ldots, n\}$. We associate to this graph a symmetric matrix, called the adjacency
matrix $[a_{ij}]$, the entries of which are nonnegative integers, by setting $a_{ij}$ to be the
number of edges in $\Gamma$ connecting vertex i to vertex j. The matrix represents a
selfadjoint endomorphism of $\mathbb{R}^n$ with respect to some basis and its spectrum can be
used to derive important information about $\Gamma$. This technique has important
applications in the analysis of computer networks, in the design of error-correcting
codes, and in such areas as chemistry, where it is used to make rough estimates of
the electron density distribution of molecules. Another example is the following: If
$B = \{v_1, \ldots, v_n\}$ is a set of distinct vectors in $\mathbb{R}^n$, and if $\| \cdot \|$ is a norm on $\mathbb{R}^n$, then
the distance matrix defined by B is the matrix $[\|v_i - v_j\|]$. This matrix is symmetric
and so defines a selfadjoint endomorphism of $\mathbb{R}^n$. Computing the eigenvalues
of such matrices has important applications in many areas, including bioinformatics
and X-ray crystallography.


Proposition 17.5 If V is a nontrivial finitely-generated inner product space,
then $\operatorname{spec}(\alpha) \neq \emptyset$ for any selfadjoint endomorphism $\alpha$ of V.


Proof Let $\alpha$ be a selfadjoint endomorphism of V. Choose an orthonormal basis
$B = \{v_1, \ldots, v_n\}$ for V and let $A = \Phi_{BB}(\alpha)$. Since $\alpha$ is selfadjoint, we know that
$A = A^H$. Let $W = \mathbb{C}^n$ on which we have defined the dot product. Then the
endomorphism $\beta$ of W defined by $\beta : w \mapsto Aw$ is selfadjoint. The degree of the
characteristic polynomial $|XI - A|$ of $\beta$ is $n > 0$ and so, by the Fundamental Theorem of
Algebra, it has a root $c \in \mathbb{C}$. Thus the matrix $cI - A$ is singular and so there exists a
nonzero vector $w \in W$ satisfying $Aw = cw$. In other words, $c \in \operatorname{spec}(\beta)$. By
Proposition 17.4, this implies that $c \in \mathbb{R}$ and so $c \in \operatorname{spec}(\alpha)$, even if V is an inner product
space over $\mathbb{R}$.


In particular, we learn from Proposition 17.5 that every symmetric matrix over
$\mathbb{R}$ has an eigenvalue in $\mathbb{R}$. Compare this to the example we have already seen of a
symmetric matrix in $M_{2 \times 2}(GF(2))$ having no eigenvalues. Similarly, the symmetric


matrix E | € M2x2(Q) has no eigenvalues in Q. 


Let V be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha$ be a
selfadjoint endomorphism of V. We know, by Proposition 17.4, that the eigenvalues of
$\alpha$ are all real and that eigenvectors of $\alpha$ associated with distinct eigenvalues are
orthogonal. Let us denote the eigenvalues of $\alpha$ by $c_1, \ldots, c_n$ where the indices are so
chosen that $c_1 \ge \cdots \ge c_n$. An important result known as the Courant-Fischer Minimax
Theorem states that, for each $1 \le k \le n$, we have
$c_k = \sup\{\inf\{\langle \alpha(w), w \rangle \mid w \in W \text{ and } \|w\| = 1\}\}$, where the supremum runs over all
subspaces W of V having dimension k.

Let us look at this from a different perspective. The function which assigns to
each $0_V \neq v \in V$ the scalar $R_\alpha(v) = \langle v, \alpha(v) \rangle \|v\|^{-2}$ is called the Rayleigh quotient
function. Note that the projection $\pi_v$ defined in connection with the Gram-Schmidt
theorem satisfies the condition that $\pi_v : \alpha(v) \mapsto R_\alpha(v) v$. By what we have already
seen, the image D of this function is contained in $\mathbb{R}$. Moreover, if v is an eigenvector
of $\alpha$ with associated eigenvalue c, then $R_\alpha(v) = c$, and so $\emptyset \neq \operatorname{spec}(\alpha) \subseteq D$. On
the other hand, it is possible to show (though we will not do it here) that D is
contained in the closed interval $[c_n, c_1]$ bounded by the smallest and the largest
eigenvalues of $\alpha$, both endpoints of which in fact belong to D. This observation
can be used to define the Rayleigh quotient iterative scheme to find eigenvalues of a
selfadjoint endomorphism $\alpha$:

As an initial guess, choose a normal vector $v_0$ and let $d_0 = R_\alpha(v_0)$.
For $k = 0, 1, 2, \ldots$ repeat the following steps:
(1) If $\alpha - d_k\sigma_1 \notin \operatorname{Aut}(V)$, then $d_k$ is an eigenvalue of $\alpha$, and we are done;
(2) Otherwise, $\alpha - d_k\sigma_1 \in \operatorname{Aut}(V)$. Set $y = (\alpha - d_k\sigma_1)^{-1}(v_k)$ and then compute
$v_{k+1} = \|y\|^{-1} y$ and $d_{k+1} = R_\alpha(v_{k+1})$.


This scheme will indeed produce an eigenvalue of $\alpha$ for all guesses of $v_0$ except
those in a set of measure 0, and when it converges, the convergence is very rapid. Its
main disadvantage is the time and effort needed in step (1) of the iteration to decide
if $\alpha - d_k\sigma_1$ is an automorphism of V or not (usually, if the matrix representing this
endomorphism is nonsingular or not) and, if it is, to compute its inverse; the
algorithm is therefore worthwhile only if this can be done without major computational
effort.


With kind permission of the Archives of the Mathematisches Forschungsinstitut
Oberwolfach (Fischer, Courant); with kind permission of the Science Photo Library (Strutt).

The twentieth-century German mathematicians Ernst Fischer and Richard Courant
studied spaces of functions. Courant, who headed the Mathematics Institute at the
University of Göttingen, fled Germany in 1933 and founded a similar institute in New York
City, which now bears his name. John William Strutt, Lord Rayleigh, was a
nineteenth-century British physicist and applied mathematician, who made important
contributions to mathematical physics and who won the Nobel prize in 1904 for his
discovery of the inert gas argon.


Example Let $\alpha$ be the endomorphism of $\mathbb{R}^3$ represented with respect to the canonical
basis by the symmetric matrix $A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 3 & 1 \\ 1 & 1 & 4 \end{bmatrix}$. Then $\alpha$ is selfadjoint. Choose
$v_0 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$. Using the above algorithm, we see that $d_0 = R_\alpha(v_0) = 5$, which is
not an eigenvalue of $\alpha$. Moreover,

$$v_1 = \begin{bmatrix} 0.3841106399\ldots \\ 0.5121475201\ldots \\ 0.7682212801\ldots \end{bmatrix} \quad \text{and} \quad d_1 = 5.213114754\ldots,$$

$$v_2 = \begin{bmatrix} 0.3971170680\ldots \\ 0.5206615990\ldots \\ 0.7557840528\ldots \end{bmatrix} \quad \text{and} \quad d_2 = 5.214319743\ldots.$$

The actual value of an eigenvalue of $\alpha$ is $5.214319744\ldots$, so we see that
convergence was very rapid indeed.
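
The Rayleigh quotient iterative scheme can be sketched in a few lines of code. What follows is a minimal illustration rather than a production implementation. It assumes that the endomorphism is given by a real symmetric matrix stored as a list of rows; the helper names `solve`, `rayleigh` and `rayleigh_iteration` are chosen here for illustration; and the singularity test of step (1) is only approximated, by watching for a numerically zero pivot during Gaussian elimination.

```python
import math


def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting.

    Returns None when a pivot is numerically zero, i.e. M looks singular.
    """
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):          # back substitution
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def rayleigh(A, v):
    """Rayleigh quotient <v, A v> / ||v||^2."""
    Av = [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]
    return sum(x * y for x, y in zip(v, Av)) / sum(x * x for x in v)


def rayleigh_iteration(A, v0, steps):
    """Return the successive estimates [d_0, d_1, ...] (at most steps + 1 values)."""
    n = len(A)
    v = list(v0)
    estimates = [rayleigh(A, v)]
    for _ in range(steps):
        d = estimates[-1]
        shifted = [[A[i][j] - (d if i == j else 0.0) for j in range(n)] for i in range(n)]
        y = solve(shifted, v)
        if y is None:                     # A - d*I is singular: d is an eigenvalue
            break
        norm = math.sqrt(sum(t * t for t in y))
        v = [t / norm for t in y]
        estimates.append(rayleigh(A, v))
    return estimates


A = [[2.0, 1.0, 1.0], [1.0, 3.0, 1.0], [1.0, 1.0, 4.0]]
v0 = [1 / math.sqrt(3)] * 3
d = rayleigh_iteration(A, v0, steps=2)
```

With the matrix and initial vector of the example above, the list `d` holds three estimates: the first is 5 up to rounding, the second is 318/61, and the third lies close to the eigenvalue quoted in the example.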


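Proposition 17.6 Let V be a finitely-generated inner product space and let
$\alpha \in \operatorname{End}(V)$. If W is a subspace of V invariant under $\alpha$, then $W^{\perp}$ is invariant
under $\alpha^*$.


Proof If $w \in W$ and $y \in W^{\perp}$, then $\alpha(w) \in W$ and so $\langle w, \alpha^*(y) \rangle = \langle \alpha(w), y \rangle = 0$,
whence $\alpha^*(y) \in W^{\perp}$.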


Let V be a nontrivial inner product space finitely generated over $\mathbb{R}$ and assume that
$\alpha \in \operatorname{End}(V)$ is orthogonally diagonalizable. Then there exists an orthonormal basis
$B = \{v_1, \ldots, v_n\}$ composed of eigenvectors of $\alpha$. Thus $\Phi_{BB}(\alpha)$ is a diagonal matrix
and so symmetric. In particular, $\Phi_{BB}(\alpha^*) = \Phi_{BB}(\alpha)^T = \Phi_{BB}(\alpha)$, which proves
that $\alpha = \alpha^*$ and so $\alpha$ is selfadjoint. The converse of this result follows from the
following proposition.


Proposition 17.7 Let V be a nontrivial finitely-generated inner product
space and let $\alpha \in \operatorname{End}(V)$ be selfadjoint. Then $\alpha$ is orthogonally diagonalizable.
The converse holds if the field of scalars is $\mathbb{R}$.


Proof We will prove the result by induction on $n = \dim(V)$. For $n = 1$, we know
by Proposition 17.5 that $\alpha$ has an eigenvector $v \in V$, and so $\{v_1\}$ is the desired
basis, where $v_1 = \|v\|^{-1} v$. Now assume that $n > 1$ and that the proposition has been
established for all spaces of dimension less than n. Pick $v_1$ as before and let W be
the subspace of V generated by $\{v_1\}$. Then $V = W \oplus W^{\perp}$ and, by Proposition 17.6,
we know that $W^{\perp}$ is invariant under $\alpha^* = \alpha$. Moreover, $W^{\perp}$ is an inner product
space of dimension $n - 1$ and the restriction of $\alpha$ to $W^{\perp}$ is selfadjoint. Therefore,
by the induction hypothesis, there exists an orthonormal basis $\{v_2, \ldots, v_n\}$ of $W^{\perp}$
composed of eigenvectors of $\alpha$. Since $v_1$ is orthogonal to each of the vectors in this
basis, we see that $\{v_1, \ldots, v_n\}$ is an orthonormal basis of V.

Now assume that the field of scalars is $\mathbb{R}$ and that $\alpha \in \operatorname{End}(V)$ is orthogonally
diagonalizable. Then there exists an orthonormal basis D of V composed of eigenvectors
of $\alpha$. This means that $\Phi_{DD}(\alpha)$ is a diagonal matrix, which is surely symmetric,
and so by Proposition 16.16 we see that $\alpha$ is selfadjoint.


Example The converse part of Proposition 17.7 is not true if the field of scalars
is $\mathbb{C}$. Indeed, consider the endomorphism $\alpha$ of $\mathbb{C}^2$ represented with respect to the


canonical basis by the matrix E ; | The characteristic polynomial of w is X?, 


so were it diagonalizable, it would have to be equal to $\sigma_0$.


Let V be an inner product space. An endomorphism $\alpha \in \operatorname{End}(V)$ is positive definite
(resp., positive semidefinite) if and only if it is selfadjoint and satisfies the condition
that $\langle \alpha(v), v \rangle$ is a positive (resp., nonnegative) real number for all $0_V \neq v \in V$.
If there exist $0_V \neq v, w \in V$ satisfying $\langle \alpha(v), v \rangle > 0 > \langle \alpha(w), w \rangle$, then $\alpha$ is
indefinite.

We see that $\sigma_c$ is positive definite for any positive real number c. We also note
that a positive-definite endomorphism must be monic, since $\alpha(v) = 0_V$ implies that
$\langle \alpha(v), v \rangle = 0$ and so $v = 0_V$. Therefore, every positive-definite endomorphism of a
finitely-generated inner product space is in fact an automorphism. Positive definite
endomorphisms have important applications in optimization and linear programming.¹


Example Let $D = \{z \in \mathbb{C} \mid |z| < 1\}$. If $z_1, \ldots, z_n$ are distinct complex numbers in D,
and if we are given complex numbers $w_1, \ldots, w_n$ in D, one can ask if there exists
an analytic function $f : D \to D$ satisfying $f(z_i) = w_i$ for all $1 \le i \le n$. The
Nevanlinna-Pick Interpolation Theorem states that such a function exists if and only
if the matrix $[a_{ij}]$ in which $a_{ij} = (1 - w_i\overline{w_j})(1 - z_i\overline{z_j})^{-1}$ for all $1 \le i, j \le n$,
represents a positive-definite endomorphism of $\mathbb{C}^n$. This theorem has been generalized
considerably in many directions.


Rolf Nevanlinna was a twentieth-century Finnish 
mathematician who worked mostly in analysis. 
Georg Pick was a twentieth-century Austrian earth 
geometer, who was a good friend of Einstein. 


Note that if $\alpha$ is positive definite and if $0_V \neq v \in V$ then $0 < \langle \alpha(v), v \rangle = \|\alpha(v)\| \cdot
\|v\| \cos(t)$, where t is the angle between $\alpha(v)$ and v, showing that $0 \le t < \frac{\pi}{2}$.


Example Let $V = \mathbb{R}^n$ on which we have the dot product defined, and let B be
the canonical basis. An endomorphism $\alpha$ of V is positive definite if and only if
$A = \Phi_{BB}(\alpha)$ is a symmetric matrix satisfying the condition that $v^T A v > 0$ for all
nonzero vectors $v \in V$. Such matrices have nice properties. For example, it can be
shown that if A is of this form then the Gauss-Seidel method applied to an equation
$AX = w$ will converge to the unique solution, for any initial guess $v_0$ chosen. If
$\alpha$ is positive definite, then the norm on V defined by $\|v\|_A = \sqrt{v^T A v}$ is called an
elliptic norm. Any norm on V can be reasonably approximated by an elliptic norm,
a fact of importance in numerical analysis.
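
The convergence of the Gauss-Seidel method for a symmetric positive-definite matrix can be observed on a small example. The sketch below is illustrative only: the function name `gauss_seidel`, the fixed number of sweeps, and the particular matrix, right-hand side and starting vector are all choices made here, not taken from the text. Each sweep updates the coordinates one at a time, always using the newest values of the others.

```python
def gauss_seidel(A, w, x0, sweeps):
    """Run a fixed number of Gauss-Seidel sweeps on A x = w, starting from x0."""
    n = len(A)
    x = list(x0)
    for _ in range(sweeps):
        for i in range(n):
            # Solve the i-th equation for x[i], using the current other coordinates.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (w[i] - s) / A[i][i]
    return x


# A symmetric matrix whose leading principal minors 2, 5, 17 are all positive,
# so it is positive definite.
A = [[2.0, 1.0, 1.0], [1.0, 3.0, 1.0], [1.0, 1.0, 4.0]]
w = [4.0, 5.0, 6.0]            # chosen so that the exact solution is (1, 1, 1)
x = gauss_seidel(A, w, x0=[10.0, -7.0, 3.0], sweeps=60)
```

After the sweeps, `x` is very close to the exact solution $(1, 1, 1)$, even though the starting vector was far from it. A practical implementation would stop on a residual test rather than after a fixed number of sweeps.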


¹ Systems of linear equations defined by positive-definite endomorphisms of $\mathbb{R}^n$ first appear in
Gauss’ work on least-squares approximation, which we will consider in a later chapter.


Example As an immediate consequence of the observation in the previous example,
we see that the endomorphism $\alpha$ of $\mathbb{R}^2$ defined by $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a + b \\ b \end{bmatrix}$ satisfies the
condition that $\langle \alpha(v), v \rangle > 0$ for all $0_V \neq v \in V$, but is not selfadjoint and so is not
positive definite since $\Phi_{BB}(\alpha) = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ is not symmetric.


Example Even if og 4 a € End(V) is selfadjoint, it may be the case that neither a 


nor —a is positive definite. For example, if « € End(R7) is defined by a : © ie 


[I] 
[oJ mere ([o])-[o]=-r=eor(Lr]) Li} 


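Example The endomorphism $\alpha : \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} a + b \\ a + b \end{bmatrix}$ of $\mathbb{R}^2$ is selfadjoint and, for any
$v = \begin{bmatrix} a \\ b \end{bmatrix}$, we check that $\langle \alpha(v), v \rangle = (a + b)^2 \ge 0$, so $\alpha$ is positive semidefinite. On
the other hand, if $v = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ then $\langle \alpha(v), v \rangle = 0$, so $\alpha$ is not positive definite. Since
$\alpha$ is represented with respect to the canonical basis by the matrix $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, this also
shows that in order for an endomorphism to be positive definite, it is not sufficient
that it be represented by a matrix all of the entries of which are positive.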


Let V be an inner product space. If $\alpha, \beta \in \operatorname{End}(V)$ are selfadjoint, then $\alpha - \beta$ is also
selfadjoint. We write $\alpha > \beta$ whenever $\alpha - \beta$ is positive definite. Thus, $\alpha$ is positive
definite if and only if $\alpha > \sigma_0$. We write $\alpha \ge \beta$ if and only if $\alpha > \beta$ or $\alpha = \beta$. We
claim this is a partial-order relation on the set of all selfadjoint endomorphisms of V.
Indeed, it is clear that $\alpha \ge \alpha$ for all such endomorphisms $\alpha$. Suppose that $\alpha_1$, $\alpha_2$, and
$\alpha_3$ are selfadjoint endomorphisms of V satisfying $\alpha_1 \ge \alpha_2 \ge \alpha_3$. If $\alpha_1 = \alpha_2$ or
$\alpha_2 = \alpha_3$ then it is clear that $\alpha_1 \ge \alpha_3$. Let us therefore assume that $\alpha_1 > \alpha_2 > \alpha_3$. Then
for all $0_V \neq v \in V$ we see that $\langle (\alpha_1 - \alpha_3)(v), v \rangle = \langle \alpha_1(v) - \alpha_2(v) + \alpha_2(v) - \alpha_3(v), v \rangle =
\langle \alpha_1(v) - \alpha_2(v), v \rangle + \langle \alpha_2(v) - \alpha_3(v), v \rangle > 0$ and so $\alpha_1 > \alpha_3$. Finally, assume that
$\alpha_1 \ge \alpha_2$ and $\alpha_2 \ge \alpha_1$ but $\alpha_1 \neq \alpha_2$. Then $\alpha_1 > \alpha_2 > \alpha_1$ and so, as we have seen,
$\alpha_1 > \alpha_1$, which is a contradiction. Thus we have a partial order on the set of all
selfadjoint endomorphisms of V, called the Loewner partial order.


The Czech mathematician Karl Loewner emigrated to the United 
States in 1933. His research concentrated in complex function theory 
and spaces of functions. 


Example Let V be a finite-dimensional inner product space. Inequalities of the form
$\sum_{i=1}^{n} a_i\alpha_i > \sigma_0$, where the $\alpha_i$ are in End(V) and the $a_i$ are scalars, play an important
part in control theory, and have been studied extensively.


Proposition 17.8 Let V be an inner product space and let $\alpha \in \operatorname{End}(V)$ be an
endomorphism for which $\alpha^*$ exists. Then $\alpha$ is positive definite if and only if
the function $\mu : (v, w) \mapsto \langle \alpha(v), w \rangle$ from $V \times V$ to the field of scalars is also
an inner product.


Proof First, let us assume that $\alpha$ is a positive-definite endomorphism of V. If
$v_1, v_2, w \in V$ then $\mu(v_1 + v_2, w) = \langle \alpha(v_1 + v_2), w \rangle = \langle \alpha(v_1), w \rangle + \langle \alpha(v_2), w \rangle =
\mu(v_1, w) + \mu(v_2, w)$ and, similarly, we show that $\mu(cv, w) = c\mu(v, w)$ for all
scalars c. We also see that $\mu(v, w) = \langle \alpha(v), w \rangle = \langle v, \alpha^*(w) \rangle = \langle v, \alpha(w) \rangle =
\overline{\langle \alpha(w), v \rangle} = \overline{\mu(w, v)}$. If $0_V \neq v \in V$ then, by the assumption of positive definiteness,
we see that $\mu(v, v) = \langle \alpha(v), v \rangle$ is a positive real number, and it is clear that
$\mu(0_V, 0_V) = 0$. Thus $\mu$ is an inner product on V.

Conversely, assume that $\mu$ is an inner product on V. Then for all $v, w \in V$ we
have $\langle v, \alpha^*(w) \rangle = \langle \alpha(v), w \rangle = \mu(v, w) = \overline{\mu(w, v)} = \overline{\langle \alpha(w), v \rangle} = \langle v, \alpha(w) \rangle$ and so
$\alpha(w) = \alpha^*(w)$ for all $w \in V$, proving that $\alpha$ is selfadjoint. Moreover, we have
$\langle \alpha(v), v \rangle = \mu(v, v) > 0$ for all $0_V \neq v \in V$ and so $\alpha$ is positive definite.


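Proposition 17.9 Let V be an inner product space, with a given inner product
$(v, w) \mapsto \langle v, w \rangle$, and let $\mu$ be another inner product defined on V. Then there
exists a unique positive-definite endomorphism $\alpha$ of V satisfying the condition
that $\mu(v, w) = \langle \alpha(v), w \rangle$ for all $v, w \in V$.


Proof Fix a vector $w \in V$. The function $v \mapsto \mu(v, w)$ belongs to D(V) and so there
exists a unique vector $y_w \in V$ satisfying $\mu(v, w) = \langle v, y_w \rangle$ for all $v \in V$. Define
a function $\alpha : V \to V$ by $\alpha : w \mapsto y_w$. Then $\langle \alpha(v), w \rangle = \overline{\langle w, \alpha(v) \rangle} = \overline{\mu(w, v)} =
\mu(v, w)$ for all $v, w \in V$. We claim that $\alpha \in \operatorname{End}(V)$. Indeed, if $w_1, w_2 \in V$ then for
all $y \in V$ we have

$$\begin{aligned} \langle \alpha(w_1 + w_2), y \rangle &= \mu(w_1 + w_2, y) = \mu(w_1, y) + \mu(w_2, y) \\ &= \langle \alpha(w_1), y \rangle + \langle \alpha(w_2), y \rangle = \langle \alpha(w_1) + \alpha(w_2), y \rangle \end{aligned}$$

and so $\alpha(w_1 + w_2) = \alpha(w_1) + \alpha(w_2)$. Similarly, we can show that $\alpha(cw) = c\alpha(w)$
for all $w \in V$ and all scalars c. Thus we see that $\alpha$ is indeed an endomorphism of V
satisfying the condition $\mu(v, w) = \langle \alpha(v), w \rangle$ for all $v, w \in V$, and so it is positive
definite.

Finally, $\alpha$ has to be unique since if $\mu(v, w) = \langle \beta(v), w \rangle$ for all $v, w \in V$, then
$\langle (\alpha - \beta)(v), w \rangle = \langle \alpha(v) - \beta(v), w \rangle = \langle \alpha(v), w \rangle - \langle \beta(v), w \rangle = 0$ for all $v, w \in V$,
which implies that $(\alpha - \beta)(v) = 0_V$ for all $v \in V$, showing that $\alpha = \beta$.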


Proposition 17.10 Let V be a finitely-generated inner product space and let
$\alpha \in \operatorname{End}(V)$. Then $\alpha$ is positive definite if and only if there exists an automorphism
$\beta$ of V satisfying $\alpha = \beta^*\beta$.


Proof Assume that there exists an automorphism $\beta$ of V satisfying $\alpha = \beta^*\beta$.
Then, as previously noted, $\alpha$ is selfadjoint. Moreover, for all $0_V \neq v \in V$ we have
$\langle \alpha(v), v \rangle = \langle \beta^*\beta(v), v \rangle = \langle \beta(v), \beta^{**}(v) \rangle = \langle \beta(v), \beta(v) \rangle > 0$ since $\beta$ is an
automorphism and hence $\beta(v) \neq 0_V$. Therefore, $\alpha$ is positive definite.

Conversely, assume that $\alpha$ is a positive-definite endomorphism of V. Then the
function $\mu : (v, w) \mapsto \langle \alpha(v), w \rangle$ is an inner product on V. Let $\{v_1, \ldots, v_n\}$ be a
basis for V which is orthonormal with respect to the original inner product on V
and let $\{w_1, \ldots, w_n\}$ be a basis for V which is orthonormal with respect to $\mu$.
By Proposition 6.2, we know that there exists a unique endomorphism $\beta$ of V
satisfying $\beta(w_i) = v_i$ for all $1 \le i \le n$. Then $\beta$ is an epimorphism since its image
contains a basis for V and so, since V is finitely-generated, it is an automorphism
of V. Therefore, if $v = \sum_{i=1}^{n} a_i w_i$ and $w = \sum_{j=1}^{n} b_j w_j$ are vectors in V we
see that

$$\langle \alpha(v), w \rangle = \mu(v, w) = \mu\left( \sum_{i=1}^{n} a_i w_i, \sum_{j=1}^{n} b_j w_j \right) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_i \overline{b_j}\, \mu(w_i, w_j) = \sum_{i=1}^{n} a_i \overline{b_i}$$

and similarly

$$\langle \beta^*\beta(v), w \rangle = \langle \beta(v), \beta(w) \rangle = \left\langle \beta\left( \sum_{i=1}^{n} a_i w_i \right), \beta\left( \sum_{j=1}^{n} b_j w_j \right) \right\rangle = \sum_{i=1}^{n} \sum_{j=1}^{n} a_i \overline{b_j} \langle v_i, v_j \rangle = \sum_{i=1}^{n} a_i \overline{b_i},$$

and so we see that $\langle \beta^*\beta(v), w \rangle = \langle \alpha(v), w \rangle$ for all $v, w \in V$, which shows that
$\alpha = \beta^*\beta$.


From Proposition 17.10 we know that if $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{C})$ is a matrix
representing a positive-definite endomorphism of $\mathbb{C}^n$, namely if it is a Hermitian matrix
satisfying the condition that $v \cdot Av > 0$ for all nonzero vectors $v \in \mathbb{C}^n$, then there
exists a nonsingular matrix $B$ such that $A = B^H B$. Indeed, we can choose $B$ to
be upper triangular, so that it is a form of LU-decomposition, though it takes only
half as many arithmetic operations to perform. This decomposition is known as a
Cholesky decomposition of $A$. This decomposition need not be unique. Cholesky
decompositions are widely used in building economic and financial models.
Because of this wide usage, there are many algorithms available to efficiently calculate
Cholesky decompositions of general matrices or of matrices in special forms.
Indeed, one of the computational advantages of the Cholesky decomposition is that it
is numerically stable, even with no pivoting. On the other hand, if you change even
one element of $A$ you have to recompute the Cholesky decomposition of the new
matrix from scratch.




Major André-Louis Cholesky was a cartographer in the French army, 
who used this method in connection with the mapping of the island of 
Crete before World War I. It had previously been used by other cartog- 
raphers, including Myrick H. Doolittle, of the computing division of 
the US Coast and Geodetic Survey, in 1878. A mathematical formula- 
tion had been given earlier by Toeplitz. 


The following algorithm calculates a Cholesky decomposition for real symmetric
matrices.


For $k = 1, \ldots, n$ perform the following steps:
(1) For each $1 \leq i < k$ define $b_{ik} = b_{ii}^{-1}\bigl[a_{ik} - \sum_{j=1}^{i-1} b_{ji} b_{jk}\bigr]$;
(2) Set $b_{kk} = \sqrt{a_{kk} - \sum_{j=1}^{k-1} b_{jk}^2}$;
(3) For each $k < i \leq n$ set $b_{ik} = 0$.


Note that if the matrix A did not satisfy v- Av > 0 for all nonzero vectors v, the 
algorithm would hang up at some stage, trying to take the square root of a negative 
number. Indeed, attempting a Cholesky decomposition is often used as a test to see 
whether a given matrix represents a positive-definite endomorphism or not. 
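
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cholesky_upper` is hypothetical, indices are 0-based rather than 1-based, and the sketch raises an error as soon as the quantity under the square root is not positive (so it also rejects the semidefinite case, which would otherwise lead to a division by zero in step (1)).

```python
import math

def cholesky_upper(a):
    """Return an upper-triangular B with B^T B = A (A real symmetric)."""
    n = len(a)
    b = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # Step (1): entries of column k that lie above the diagonal.
        for i in range(k):
            s = sum(b[j][i] * b[j][k] for j in range(i))
            b[i][k] = (a[i][k] - s) / b[i][i]
        # Step (2): the diagonal entry; this fails when A is not positive definite.
        d = a[k][k] - sum(b[j][k] ** 2 for j in range(k))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        b[k][k] = math.sqrt(d)
        # Step (3): entries below the diagonal are already 0.
    return b

A = [[5.0, 2.0, 3.0], [2.0, 1.0, 1.0], [3.0, 1.0, 4.0]]
B = cholesky_upper(A)
```

Calling the function on a symmetric matrix that is not positive definite, such as `[[1.0, 2.0], [2.0, 1.0]]`, raises `ValueError`, which mirrors the use of the decomposition as a test for positive definiteness.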


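Example Let $A = \begin{pmatrix} 5 & 2 & 3 \\ 2 & 1 & 1 \\ 3 & 1 & 4 \end{pmatrix} \in \mathcal{M}_{3 \times 3}(\mathbb{R})$. This is a symmetric matrix satisfying
the condition that $v \cdot Av > 0$ for all nonzero vectors $v \in \mathbb{R}^3$ and having a Cholesky
decomposition $B^T B$, where

$$B = \begin{pmatrix} \sqrt{5} & \frac{2}{5}\sqrt{5} & \frac{3}{5}\sqrt{5} \\ 0 & \frac{1}{5}\sqrt{5} & -\frac{1}{5}\sqrt{5} \\ 0 & 0 & \sqrt{2} \end{pmatrix}.$$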

Notice that Proposition 17.10 extends our ongoing analogy between the
operation $*$ and the conjugate operation on $\mathbb{C}$, just as the notion of "positive definite"
is the analog of positivity of complex numbers: a complex number $z$ is (real and)
positive if and only if there exists a nonzero complex number $y$ such that $z = \bar{y}y$.


Cholesky decompositions do not work for Hermitian matrices representing
indefinite endomorphisms of $\mathbb{C}^n$. In such cases, one has to make use of other methods,
such as the Bunch-Kaufman algorithm, which is quite effective for sparse matrices.


Proposition 17.11 Let $V$ be an inner product space. If $\alpha \in \mathrm{End}(V)$ is positive
definite, then every eigenvalue of $\alpha$ is a positive real number. The converse
holds if $\alpha$ is orthogonally diagonalizable.


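Proof Assume that $\alpha$ is positive definite. By Proposition 17.4, the eigenvalues of $\alpha$
are real numbers. If $c \in \mathrm{spec}(\alpha)$ is an eigenvalue of $\alpha$ associated with an
eigenvector $v$, then $0 < \langle \alpha(v), v \rangle = \langle cv, v \rangle = c\langle v, v \rangle$ and so $c > 0$, since we know
that $\langle v, v \rangle > 0$. Conversely, assume that every eigenvalue of $\alpha$ is positive and that
there exists an orthonormal basis $B$ of $V$ composed of eigenvectors of $\alpha$. Let
$v = \sum_{i=1}^n a_i v_i$ be a nonzero vector in $V$, where $\{v_1, \ldots, v_n\} \subseteq B$. For each $1 \leq i \leq n$, let $c_i$ be an
eigenvalue of $\alpha$ associated with $v_i$. We can assume that the $v_i$ are arranged in such a
manner that $0 < c_1 \leq c_2 \leq \cdots \leq c_n$. Then

$$\langle \alpha(v), v \rangle = \sum_{i=1}^n \sum_{j=1}^n a_i \bar{a}_j \langle c_i v_i, v_j \rangle = \sum_{i=1}^n \sum_{j=1}^n c_i a_i \bar{a}_j \langle v_i, v_j \rangle = \sum_{i=1}^n c_i |a_i|^2 \geq c_1 \sum_{i=1}^n |a_i|^2 > 0,$$

and so $\alpha$ is positive definite.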


From Propositions 17.11 and 17.7, we see that if $V$ is a finitely-generated
inner product space over $\mathbb{R}$ or $\mathbb{C}$ and if $\alpha \in \mathrm{End}(V)$ is positive definite, then there
exists a basis of $V$ relative to which $\alpha$ is represented by a diagonal matrix in which
the entries of the diagonal are positive real numbers. Such a matrix is, of course,
nonsingular.


Proposition 17.12 Let $V$ and $W$ be inner-product spaces finitely-generated
over $\mathbb{R}$ and let $\alpha \in \mathrm{Hom}(V, W)$. Then $\|\alpha\| = \sqrt{c}$, where $c$ is the largest
eigenvalue of $\alpha^*\alpha \in \mathrm{End}(V)$, and where $\|\alpha\|$ is the norm induced by the respective
inner products on $V$ and $W$.


Proof If $c$ is an eigenvalue of $\beta = \alpha^*\alpha$ then there exists a nonzero vector $v$ such that
$\beta(v) = cv$ and so $c\|v\|^2 = \langle v, cv \rangle = \langle v, \alpha^*\alpha(v) \rangle = \langle \alpha(v), \alpha(v) \rangle \geq 0$, and so $c \geq 0$.
By Proposition 17.7, we know that there exists a basis $\{v_1, \ldots, v_n\}$ of $V$ composed
of orthonormal eigenvectors of $\beta$. For each $1 \leq i \leq n$, let $c_i$ be an eigenvalue of
$\beta$ associated with $v_i$. After renumbering, we can assume that $0 \leq c_1 \leq \cdots \leq c_n$. If
$v = \sum_{i=1}^n a_i v_i \in V$, then

$$\|\alpha(v)\|^2 = \langle v, \alpha^*\alpha(v) \rangle = \langle v, \beta(v) \rangle = \sum_{j=1}^n \sum_{i=1}^n a_j c_i a_i \langle v_j, v_i \rangle = \sum_{i=1}^n c_i a_i^2 \leq c_n \Bigl(\sum_{i=1}^n a_i^2\Bigr) = c_n \|v\|^2$$

and so $\|\alpha(v)\|^2/\|v\|^2 \leq c_n$ whenever $v$ is nonzero. Therefore, by definition of the induced norm, $\|\alpha\| \leq \sqrt{c_n}$.
But one easily sees that $\|\alpha(v_n)\|^2/\|v_n\|^2 = c_n$ and so $\sqrt{c_n} \leq \|\alpha\|$, proving
equality.


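Example Let $\alpha: \mathbb{R}^3 \to \mathbb{R}^2$ be the linear transformation defined by $\alpha: v \mapsto Av$,
where $A = \begin{pmatrix} 1 & -2 & 1 \\ 3 & 0 & -1 \end{pmatrix}$. Then $\alpha^*\alpha$ is the endomorphism of $\mathbb{R}^3$ given by

$$v \mapsto \begin{pmatrix} 10 & -2 & -2 \\ -2 & 4 & -2 \\ -2 & -2 & 2 \end{pmatrix} v.$$

The eigenvalues of this endomorphism are $0 < 8 - 2\sqrt{2} < 8 + 2\sqrt{2}$ and so
$\|\alpha\| = \sqrt{8 + 2\sqrt{2}}$, which is approximately equal to 3.291.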
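
As a numerical cross-check of Proposition 17.12, the following sketch approximates $\|\alpha\|$ for the $2 \times 3$ matrix with rows $(1, -2, 1)$ and $(3, 0, -1)$ by power iteration on $A^T A$. Power iteration is a technique chosen here for illustration and is not taken from the text; the function name `operator_norm`, the starting vector, and the iteration count are assumptions of the sketch.

```python
import math

def operator_norm(a, iterations=200):
    """Approximate sqrt(largest eigenvalue of A^T A) by power iteration."""
    rows, cols = len(a), len(a[0])
    # Gram matrix G = A^T A, which represents alpha* alpha.
    g = [[sum(a[r][i] * a[r][j] for r in range(rows)) for j in range(cols)]
         for i in range(cols)]
    v = [1.0] * cols  # arbitrary starting vector (assumed not orthogonal to the top eigenvector)
    for _ in range(iterations):
        w = [sum(g[i][j] * v[j] for j in range(cols)) for i in range(cols)]
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]
    # The Rayleigh quotient of the unit vector v approximates the largest eigenvalue.
    gv = [sum(g[i][j] * v[j] for j in range(cols)) for i in range(cols)]
    c = sum(v[i] * gv[i] for i in range(cols))
    return math.sqrt(c)

A = [[1.0, -2.0, 1.0], [3.0, 0.0, -1.0]]
norm = operator_norm(A)
```

The value of `norm` agrees with $\sqrt{8 + 2\sqrt{2}}$ to within rounding error.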


Let $V$ and $W$ be inner product spaces. A linear transformation $\alpha: V \to W$
preserves inner products if and only if $\langle v_1, v_2 \rangle = \langle \alpha(v_1), \alpha(v_2) \rangle$ for all $v_1, v_2 \in V$.
Notice that any linear transformation which preserves inner products also
preserves distances: $\|v_1 - v_2\| = \|\alpha(v_1 - v_2)\| = \|\alpha(v_1) - \alpha(v_2)\|$ for all $v_1, v_2 \in V$.
Also, as a direct consequence of the definition, such a linear transformation
preserves the angles between vectors. Conversely, we have already noted that from the
norm defined by an inner product we can recover the inner product itself, so that
any linear transformation $\alpha: V \to W$ satisfying $\|v_1 - v_2\| = \|\alpha(v_1) - \alpha(v_2)\|$ for
all $v_1, v_2 \in V$ also preserves inner products. Such a linear transformation is called an
isometry.


Proposition 17.13 Let $V$ be an inner product space over $\mathbb{C}$ and let
$\alpha \in \mathrm{End}(V)$ be an isometry. Then the eigenvalues of $\alpha$ lie on the unit circle
$\{z \in \mathbb{C} \mid |z| = 1\}$.


Proof If $c$ is an eigenvalue of $\alpha$ with associated eigenvector $v$, then
$\|v\|^2 = \|\alpha(v)\|^2 = \|cv\|^2 = |c|^2 \|v\|^2$ and so $|c| = 1$.


Example Let $V$ be an inner product space over $\mathbb{R}$ and let $0_V \neq y \in V$. This vector
$y$ defines an endomorphism $\alpha_y$ of $V$ by setting

$$\alpha_y: v \mapsto -v + 2\,\frac{\langle v, y \rangle}{\langle y, y \rangle}\, y.$$

This endomorphism is an isometry which satisfies $\alpha_y^2 = \sigma_1$, and $y$ is a fixed point
of $\alpha_y$.
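
The three claims about this map (it preserves norms, it squares to the identity, and it fixes $y$) can be checked numerically for the dot product on $\mathbb{R}^3$. The sketch below assumes the formula $v \mapsto -v + 2\frac{\langle v, y\rangle}{\langle y, y\rangle}y$; the helper names `dot` and `reflect` and the sample vectors are illustrative choices only.

```python
def dot(u, v):
    """Dot product on R^n."""
    return sum(x * y for x, y in zip(u, v))

def reflect(y, v):
    """Image of v under v -> -v + 2 (<v, y> / <y, y>) y, for nonzero y."""
    scale = 2.0 * dot(v, y) / dot(y, y)
    return [-vi + scale * yi for vi, yi in zip(v, y)]

y = [1.0, 2.0, 2.0]
v = [3.0, -1.0, 4.0]
image = reflect(y, v)          # the image of v
twice = reflect(y, image)      # applying the map a second time
fixed = reflect(y, y)          # the image of y itself
```

Here `dot(image, image)` equals `dot(v, v)`, `twice` recovers `v`, and `fixed` equals `y`.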


Proposition 17.14 Let $V$ and $W$ be finitely-generated inner product spaces
having equal dimensions. Then the following conditions on a linear
transformation $\alpha: V \to W$ are equivalent:

(1) $\alpha$ is an isometry;

(2) $\alpha$ is an isomorphism which is an isometry;

(3) If $\{v_1, \ldots, v_n\}$ is an orthonormal basis of $V$ then the set $\{\alpha(v_1), \ldots, \alpha(v_n)\}$
is an orthonormal basis of $W$.


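Proof (1) $\Rightarrow$ (2): If $0_V \neq v \in V$ then $\langle v, v \rangle = \langle \alpha(v), \alpha(v) \rangle$, and so $\alpha(v) \neq 0_W$.
Thus we see that $\ker(\alpha) = \{0_V\}$ and so $\alpha$ is an isomorphism, since $V$ and $W$ have
the same finite dimension.

(2) $\Rightarrow$ (3): If $\{v_1, \ldots, v_n\}$ is an orthonormal basis of $V$ then, since $\alpha$ is an
isomorphism, we see that $\{\alpha(v_1), \ldots, \alpha(v_n)\}$ is a basis for $W$. Moreover, for all $1 \leq i, j \leq n$
we know that

$$\langle \alpha(v_i), \alpha(v_j) \rangle = \langle v_i, v_j \rangle = \begin{cases} 1 & \text{when } i = j, \\ 0 & \text{otherwise,} \end{cases}$$

and so this basis is orthonormal.

(3) $\Rightarrow$ (1): Let $\{v_1, \ldots, v_n\}$ be an orthonormal basis of $V$. If $v = \sum_{i=1}^n a_i v_i$ and
$y = \sum_{j=1}^n b_j v_j$, then $\langle v, y \rangle = \sum_{i=1}^n a_i \bar{b}_i$. Moreover,

$$\langle \alpha(v), \alpha(y) \rangle = \Bigl\langle \alpha\Bigl(\sum_{i=1}^n a_i v_i\Bigr), \alpha\Bigl(\sum_{j=1}^n b_j v_j\Bigr) \Bigr\rangle = \sum_{i=1}^n \sum_{j=1}^n a_i \bar{b}_j \langle \alpha(v_i), \alpha(v_j) \rangle = \sum_{i=1}^n a_i \bar{b}_i = \langle v, y \rangle,$$

and this proves (1).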


In particular, if $V$ and $W$ are finitely-generated inner product spaces having equal
dimensions, then every isometry $\alpha: V \to W$ is an isomorphism. If $w_1, w_2 \in W$,
then $\langle w_1, w_2 \rangle = \langle \alpha\alpha^{-1}(w_1), \alpha\alpha^{-1}(w_2) \rangle = \langle \alpha^{-1}(w_1), \alpha^{-1}(w_2) \rangle$ and so we see that
$\alpha^{-1}$ is also an isometry. Moreover, there is always at least one isometry $\alpha$ from $V$
to $W$. Just pick orthonormal bases $\{v_1, \ldots, v_n\}$ for $V$ and $\{w_1, \ldots, w_n\}$ for $W$ and
define $\alpha$ by $\alpha: \sum_{i=1}^n a_i v_i \mapsto \sum_{i=1}^n a_i w_i$.


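Example The endomorphism of $\mathbb{R}^3$ represented with respect to the canonical basis
by $\frac{1}{\sqrt{6}} \begin{pmatrix} \sqrt{3} & \sqrt{2} & -1 \\ \sqrt{3} & -\sqrt{2} & 1 \\ 0 & \sqrt{2} & 2 \end{pmatrix}$ is an isometry.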
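Example Let $W$ be the set of all matrices $A \in \mathcal{M}_{3 \times 3}(\mathbb{R})$ satisfying $A^T = -A$,
which is a subspace of $\mathcal{M}_{3 \times 3}(\mathbb{R})$ of dimension 3. Define an inner product on $W$
as follows: if $A, B \in W$ then $\langle A, B \rangle = \frac{1}{2}\mathrm{tr}(AB^T)$. Let $V = \mathbb{R}^3$, which is an
inner product space with respect to the dot product. Define a linear transformation
$\alpha: V \to W$ by setting

$$\alpha: \begin{pmatrix} a \\ b \\ c \end{pmatrix} \mapsto \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}.$$

If $A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & -f & e \\ f & 0 & -d \\ -e & d & 0 \end{pmatrix}$ then

$$AB^T = \begin{pmatrix} cf + be & -bd & -dc \\ -ea & cf + ad & -ec \\ -af & -fb & be + ad \end{pmatrix}$$

and so we can check that $\langle A, B \rangle = \begin{pmatrix} a \\ b \\ c \end{pmatrix} \cdot \begin{pmatrix} d \\ e \\ f \end{pmatrix}$ and thus $\alpha$ is an isometry, and hence is an
isomorphism.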
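
A quick numerical check of this example can be written as follows. It assumes the map sends $(a, b, c)$ to the skew-symmetric matrix with rows $(0, -c, b)$, $(c, 0, -a)$, $(-b, a, 0)$ and that the inner product on $W$ is $\frac{1}{2}\mathrm{tr}(AB^T)$; the names `hat` and `inner_w` and the sample vectors are hypothetical.

```python
def hat(v):
    """Send (a, b, c) to the associated skew-symmetric 3x3 matrix."""
    a, b, c = v
    return [[0.0, -c, b], [c, 0.0, -a], [-b, a, 0.0]]

def inner_w(x, y):
    """Inner product (1/2) tr(X Y^T) on 3x3 matrices."""
    # tr(X Y^T) is the sum of the entrywise products x_ij * y_ij.
    return 0.5 * sum(x[i][j] * y[i][j] for i in range(3) for j in range(3))

u = [1.0, -2.0, 3.0]
v = [4.0, 0.5, 2.0]
lhs = inner_w(hat(u), hat(v))            # inner product of the images in W
rhs = sum(p * q for p, q in zip(u, v))   # dot product in R^3
```

Both `lhs` and `rhs` evaluate to the same number, as the isometry property requires.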


Example Proposition 17.14 is no longer true if we remove the condition that the
spaces are finitely generated. Indeed, let $V = C(0, 1)$, on which we have the inner
product $\mu(f, g) = \int_0^1 f(x)g(x)x^2\,dx$, and let $W$ be the same space on which we
have the inner product $\langle f, g \rangle = \int_0^1 f(x)g(x)\,dx$. Let $\alpha: V \to W$ be the linear
transformation defined by $\alpha: f(x) \mapsto xf(x)$. Then $\mu(f, g) = \langle \alpha(f), \alpha(g) \rangle$ and so $\alpha$ is
an isometry. But $\alpha$ is not an isomorphism since the function $x \mapsto x^2 + 1$ does not
belong to the image of $\alpha$.


Let us now return to the case of inner product spaces the dimensions of which 
are not necessarily equal. 


Proposition 17.15 Let $V$ and $W$ be inner product spaces finitely-generated
over $\mathbb{R}$ and let $\alpha \in \mathrm{Hom}(V, W)$. Then $\alpha$ is an isometry if and only if
$\alpha^*\alpha = \sigma_1 \in \mathrm{End}(V)$.


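Proof By Proposition 16.15, $\alpha^*$ exists. If $\alpha^*\alpha = \sigma_1 \in \mathrm{End}(V)$, and if $v_1, v_2 \in V$
then

$$\|v_1 - v_2\|^2 = \langle v_1 - v_2, v_1 - v_2 \rangle = \langle v_1 - v_2, \alpha^*\alpha(v_1 - v_2) \rangle = \langle \alpha(v_1 - v_2), \alpha(v_1 - v_2) \rangle = \|\alpha(v_1) - \alpha(v_2)\|^2,$$

and so $\|v_1 - v_2\| = \|\alpha(v_1) - \alpha(v_2)\|$, proving that $\alpha$ is an isometry. Conversely, if
$\alpha$ is an isometry and if $v_1, v_2 \in V$ then $\langle \alpha^*\alpha(v_1), v_2 \rangle = \langle \alpha(v_1), \alpha(v_2) \rangle = \langle v_1, v_2 \rangle$.
Therefore, by Proposition 16.14, we see that $\alpha^*\alpha(v_1) = v_1$ for all $v_1 \in V$, so
$\alpha^*\alpha = \sigma_1 \in \mathrm{End}(V)$.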
Exercises


Exercise 1051 
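Let $V = \mathbb{C}[X]$ and define an inner product on $V$ by setting

$$\Bigl\langle \sum_{i=0}^{\infty} a_i X^i, \sum_{i=0}^{\infty} b_i X^i \Bigr\rangle = \sum_{i=0}^{\infty} a_i \bar{b}_i.$$

Let $\alpha$ be the endomorphism of $V$ defined by $\alpha: p(X) \mapsto (X + 1)p(X)$. Calculate
$\alpha^*$, or show that it does not exist.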


Exercise 1052 
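Let $V = \mathbb{C}[X]$ and define an inner product on $V$ by setting

$$\Bigl\langle \sum_{i=0}^{\infty} a_i X^i, \sum_{i=0}^{\infty} b_i X^i \Bigr\rangle = \sum_{i=0}^{\infty} a_i \bar{b}_i.$$

Let $\beta$ be the endomorphism of $V$ defined by $\beta: p(X) \mapsto p(X + 1)$. Calculate
$\beta^*$, or show that it does not exist.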


Exercise 1053 

Let $p > 1$ be an integer, let $G = \mathbb{Z}/(p)$, and let $V = \mathbb{C}^G$, which is an inner
product space over $\mathbb{C}$ with inner product defined by $\langle f, g \rangle = \sum_{n \in G} f(n)\overline{g(n)}$.
Let $\alpha$ be the endomorphism of $V$ defined by $\alpha(f): n \mapsto f(n + 1) + f(n - 1)$.
Is $\alpha$ selfadjoint?


Exercise 1054 

Let $V$ be a vector space over $\mathbb{R}$. A nonempty subset $K$ of $V$ is convex if and
only if $cv + (1 - c)w \in K$ whenever $v, w \in K$ and $0 \leq c \leq 1$. Is the set of all
selfadjoint endomorphisms of an inner product space $Y$ over $\mathbb{R}$ necessarily a
convex subset of the vector space $\mathrm{End}(Y)$?


Exercise 1055 
Let $V$ be an inner product space and let $\alpha$ be an endomorphism of $V$. Is the
endomorphism $\alpha^*\alpha - \sigma_1$ of $V$ selfadjoint?


Exercise 1056 

Let $n$ be a positive integer and let $V$ be the space of all polynomial functions
in $\mathbb{R}^{\mathbb{R}}$ of degree at most $n$. Define an inner product on $V$ by setting
$\langle f, g \rangle = \int_{-1}^{1} f(t)g(t)\,dt$. Let $\alpha \in \mathrm{End}(V)$ be defined by
$\alpha(f): x \mapsto (1 - x^2)f''(x) - 2xf'(x)$. Show that $\alpha$ is selfadjoint.


Exercise 1057 
Let V be an inner product space finitely generated over C and let a be an endo- 
morphism of V satisfying wa* = a*. Show that a is selfadjoint. 
Exercise 1058
Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha$ and $\beta$ be
selfadjoint endomorphisms of $V$ satisfying the condition that $\alpha\beta$ is a projection.
Is $\beta\alpha$ necessarily also a projection?


Exercise 1059 
Give an example of nonzero Hermitian matrices $A$ and $B$ satisfying $AB = O = BA$,
or show that no such matrices exist.


Exercise 1060 

Let $A \in \mathcal{M}_{2 \times 2}(\mathbb{C})$ be Hermitian. Find real numbers $w$, $x$, $y$, and $z$ satisfying
$|A| = w^2 - x^2 - y^2 - z^2$.

Exercise 1061 

Let $O \neq A \in \mathcal{M}_{3 \times 3}(\mathbb{C})$ be a Hermitian matrix. Show that $A^k \neq O$ for all positive
integers $k$.


Exercise 1062 


a 0 b 
Find complex numbers a and b such that | 0 2a a | € M3 x3(C) is a Hermi- 
i lia 


tian matrix. 


Exercise 1063 
Determine all Hermitian matrices A € M5x5(C) satisfying A>+2A?+3A = 61. 


Exercise 1064 
A matrix $A \in \mathcal{M}_{n \times n}(\mathbb{C})$ is anti-Hermitian if and only if $A^H = -A$. Show that $A$
is anti-Hermitian if and only if $iA$ is Hermitian.


Exercise 1065 
If matrices $A, B \in \mathcal{M}_{n \times n}(\mathbb{C})$ are anti-Hermitian, show that the Lie product of $A$
and $B$ is also anti-Hermitian.


Exercise 1066 
Let $n$ be a positive integer and let $A \in \mathcal{M}_{n \times n}(\mathbb{C})$. Show that every eigenvalue of
$A^H A$ is a nonnegative real number.


Exercise 1067 
Let $V$ be an inner product space and let $\alpha \in \mathrm{End}(V)$ be selfadjoint. Show that
$\ker(\alpha) = \ker(\alpha^h)$ for all $h \geq 1$.
Exercise 1068

Let $V$ be a nontrivial finitely-generated inner product space and let $\alpha \in \mathrm{End}(V)$
be orthogonally diagonalizable and satisfy the condition that each of its
eigenvalues is real. Is $\alpha$ necessarily selfadjoint?


Exercise 1069 
Let $\alpha \in \mathrm{End}(\mathbb{R}^3)$ be represented with respect to the canonical basis by the matrix
$\begin{pmatrix} 1 & 2 & 3 \\ 2 & a & 4 \\ 3 & 4 & 5 \end{pmatrix}$. For which values of $a$ is $\alpha$ positive definite?

Exercise 1070 

For each complex number $z$, let $\alpha_z$ be the endomorphism of $\mathbb{C}^3$ represented with

1 1 -1 
respect to the canonical basis by 1 1 z |. Does there exist a z for which 
-1 Zz 1 


this endomorphism is positive definite? 


Exercise 1071 
Let $V$ be an inner product space and let $\alpha \in \mathrm{End}(V)$ be positive definite. Is $\alpha^2$
necessarily positive definite?


Exercise 1072 
Let $\alpha$ be a positive definite automorphism of an inner product space $V$. Is $\alpha^{-1}$
necessarily positive definite?


Exercise 1073 
Do there exist a, b, c,d € R such that the endomorphism of R* represented with 


1 lao 
: : 1 1 by. wa : 
respect to some basis by the matrix c 11138 positive definite? 
Odili 


Exercise 1074 

Let $V$ be an inner product space finitely generated over $\mathbb{R}$ and let $\alpha \in \mathrm{End}(V)$.
Let $D$ be a fixed basis for $V$ and let $A = \Phi_{DD}(\alpha)$. Recall that we can write
$A = B + C$, where $B = \frac{1}{2}(A + A^T)$ is symmetric and $C = \frac{1}{2}(A - A^T)$ is skew
symmetric. Let $\beta, \gamma \in \mathrm{End}(V)$ satisfy $B = \Phi_{DD}(\beta)$ and $C = \Phi_{DD}(\gamma)$. Show
that $\alpha$ is positive definite if and only if $\beta$ is positive definite.


Exercise 1075 

Let $\alpha$ be a positive semidefinite endomorphism of $\mathbb{R}^n$, represented with respect
to the canonical basis $\{v_1, \ldots, v_n\}$ by a symmetric matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{R})$.
Show that $|a_{ij}| \leq \frac{1}{2}(a_{ii} + a_{jj})$ for all $1 \leq i, j \leq n$.
Exercise 1076

Let $U$ be the set of all vectors $\begin{pmatrix} a_1 \\ \vdots \\ a_6 \end{pmatrix} \in \mathbb{R}^6$ satisfying the condition that
$\begin{pmatrix} a_1 & a_2 & a_3 \\ a_2 & a_4 & a_5 \\ a_3 & a_5 & a_6 \end{pmatrix}$ is positive semidefinite. Is $U$ a convex subset of $\mathbb{R}^6$?
43 a5 6 


Exercise 1077 


Do there exist real numbers a,b,c, d such that the matrix is 


Coa Fe 
Qe ee 
=e eo 
pee SF © 


positive semidefinite? 


Exercise 1078 

A selfadjoint endomorphism $\alpha$ of $\mathbb{R}^n$ is almost positive semidefinite if and only
if $\alpha(v) \cdot v \geq 0$ for all nonzero vectors $v = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}$ satisfying $\sum_{i=1}^n a_i = 0$. Give
an example of an endomorphism of R? which is almost positive semidefinite but 

not positive semidefinite. 


Exercise 1079 


Let $k$ and $n$ be positive integers. A symmetric matrix in $\mathcal{M}_{(k+n) \times (k+n)}(\mathbb{R})$ is
quasidefinite when it is of the form $\begin{pmatrix} -B & A^T \\ A & C \end{pmatrix}$, where $B$ is a matrix
representing a positive-definite endomorphism of $\mathbb{R}^k$ with respect to the canonical basis,
and $C$ is a matrix representing a positive-definite endomorphism of $\mathbb{R}^n$ with
respect to the canonical basis. Show that a quasidefinite matrix is nonsingular, and
that its inverse is again quasidefinite.


Exercise 1080 

Let V = R* together with the dot product. Find positive-definite endomorphisms 
$\alpha$ and $\beta$ of $V$ satisfying the condition that their Jordan product is not positive
definite. 


Exercise 1081 
Let $V$ be an inner product space over $\mathbb{R}$ and let $\alpha \in \mathrm{End}(V)$. Show that $\alpha$ is
positive definite if and only if $\alpha + \alpha^*$ is positive definite.


Exercise 1082 

Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha$ and $\beta$ be
positive-definite endomorphisms of $V$ satisfying $\alpha\beta = \sigma_0$. Is it necessarily true
that $\alpha = \sigma_0$ or $\beta = \sigma_0$?
Exercise 1083


cos 


Let V= a,b,c €R}$ and let W = R?, both of which together with the 


c 
dot product, are inner product spaces of dimension 3 over R. Find an isomor- 
phism a : V — W which is also an isometry. 


Exercise 1084 
Let $V$ be an inner product space and let $\alpha$ be an endomorphism of $V$ which is an
isometry. Does $\alpha$ also preserve angles between vectors?


Exercise 1085 

Let $\alpha$ be a positive-definite endomorphism of a finite-dimensional inner
product space $V$ represented with respect to some fixed basis by an $n \times n$ matrix
$A = [a_{ij}]$. Show that $|A| \leq \prod_{i=1}^n a_{ii}$.


Exercise 1086 

Let $n$ be a positive integer and let $\alpha$ be a positive-definite endomorphism of
$\mathbb{C}^n$ represented with respect to the canonical basis by the matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{C})$.
Show that $a_{ii}$ is a positive real number for all $1 \leq i \leq n$.


Exercise 1087 


a 

Let a : R? — R? be the linear transformation defined by a: | b | k - a 
c 

Calculate spec(aa*) and spec(a*a). 


Exercise 1088 

Let $n$ be a positive integer and let $V = \mathbb{R}^n$, on which we have defined the dot
product. Let $\alpha$ be a positive-definite endomorphism of $V$ represented with respect
to the canonical basis by the matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{R})$. Show that $|A| > 0$.
Is it necessarily true that $\mathrm{tr}(A) > 0$?


Exercise 1089 
Find endomorphisms a, 6 € End(C?) satisfying aw > 6 (in the sense of Loewner) 
but not a? > B?. 


Exercise 1090 
Let $V$ be an inner product space and let $\alpha, \beta \in \mathrm{End}(V)$ be positive definite. Is it
necessarily true that $\alpha + \beta \geq \beta$?


Exercise 1091 
Let V and W be inner product spaces finitely generated over C. Let 
$\alpha \in \mathrm{Hom}(V, W)$, $\theta \in \mathrm{Aut}(V)$, and $\varphi \in \mathrm{Aut}(W)$, where the automorphisms $\theta$ and
$\varphi$ are positive definite. Show that the automorphism $\theta - \alpha^*\varphi\alpha$ of $V$ is positive
definite if and only if the automorphism y — a@a* of W is positive definite. 


Exercise 1092 

Let $\alpha \in \mathrm{End}(\mathbb{R}^n)$ be represented with respect to the canonical basis by a
symmetric matrix $A$. Show that $\langle \alpha(v), v \rangle \geq 0$ for all nonzero vectors $v \in \mathbb{R}^n$ if and only
if $\alpha + c\sigma_1$ is positive definite for every positive real number $c$.


Exercise 1093 

Let $V$ be the vector space of all infinitely-differentiable functions in $\mathbb{R}^{\mathbb{R}}$, on which
we define the inner product (f, g) = in tS (x)g(x) dx. Let W be the subspace of 
all functions f € V satisfying f(0) = f(2) = 0. Show that the endomorphism 
of $W$ defined by $f \mapsto f''$ is selfadjoint.


Exercise 1094 
Let V be a finitely-generated inner product space and let a € End(V) be selfad- 
joint. Show that ||a(v)|| < ||v|| for all ve V. 


Exercise 1095 

Let $V$ be an inner product space finitely generated over $\mathbb{R}$ of dimension greater
than 1, and let $\alpha$ be a selfadjoint endomorphism of $V$. Show that there are
eigenvalues $c \leq d$ of $\alpha$ satisfying $c\|v\|^2 \leq \langle \alpha(v), v \rangle \leq d\|v\|^2$ for all $v \in V$.


Exercise 1096 

Let $V$ be an inner product space finitely generated over $\mathbb{R}$ and let $\alpha, \beta \in \mathrm{End}(V)$
be selfadjoint. Assume that the eigenvalues of $\alpha$ all lie in the interval $[a, b]$ on
the real line and that the eigenvalues of $\beta$ all lie in the interval $[c, d]$ on the real
line. Show that the eigenvalues of $\alpha + \beta$ all lie in the interval $[a + c, b + d]$ on
the real line.


Exercise 1097 
Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha$ be a
positive-definite selfadjoint automorphism of $V$. Show that $\langle (\alpha + \alpha^{-1})(v), v \rangle \geq 2\langle v, v \rangle$
for all $v \in V$.


Exercise 1098 

Let $V$ be an inner product space finitely generated over $\mathbb{R}$ and let $\alpha$ be a
positive-definite selfadjoint automorphism of $V$. Show that
$\langle \alpha^{-1}(v), v \rangle = \max\{2\langle v, w \rangle - \langle \alpha(w), w \rangle \mid w \in V\}$ for all $v \in V$.


Exercise 1099 

Let $V$ be a finite-dimensional inner product space over $\mathbb{C}$ and let $\alpha \neq \sigma_1$ be a
positive-definite endomorphism of $V$. Show that there exists no positive integer $p$
satisfying $\alpha^p = \sigma_1$.
Exercise 1100

Let $V$ be a vector space finitely-generated over $\mathbb{R}$ and let $\alpha \in \mathrm{End}(V)$ be
selfadjoint. Show that at least one of the values $\pm\|\alpha\|$ is an eigenvalue of $\alpha$ and any
eigenvalue $c$ of $\alpha$ satisfies $-\|\alpha\| \leq c \leq \|\alpha\|$.


Exercise 1101 


Let A= E "| € M2 x2(C) be Hermitian and let r > s be the (necessarily 


real) eigenvalues of $A$. Show that $|b| \leq \frac{1}{2}(r - s)$.


Exercise 1102 

Let $V$ be a nontrivial finitely-generated inner product space and let $\alpha$ and $\beta$ be
selfadjoint endomorphisms of $V$ satisfying $\alpha\beta = \beta\alpha$. Show that $\alpha$ and $\beta$ have a
common eigenvector.


Exercise 1103 

Let $n$ be a positive integer. An endomorphism $\alpha$ of $\mathbb{R}^n$ is copositive if and only if
it is selfadjoint and satisfies the condition that $\alpha(v) \cdot v$ is a positive real number
whenever v is a nonzero vector all components of which are nonnegative. Clearly, 
positive-definite endomorphisms are copositive. Give an example of a copositive 
endomorphism which is not positive definite. 


Exercise 1104 
Is the endomorphism of $\mathbb{C}^3$ represented with respect to the canonical basis by the
matrix $\begin{pmatrix} 4 & 2 - i & 1 + i \\ 2 + i & 3 & 0 \\ 1 - i & 0 & 2 \end{pmatrix} \in \mathcal{M}_{3 \times 3}(\mathbb{C})$ positive definite?


Exercise 1105 


Let a be the endomorphism of R? defined by setting a : H bt E 7 ak 


Show that a is positive definite by constructing an endomorphism £ of R? satis- 
fying a = p*B. 


Exercise 1106 
Find selfadjoint automorphisms w and f of R? satisfying the condition that 


a> B>-—a buta ¢ /p*B. 


Exercise 1107 

Let $\alpha \in \mathrm{End}(\mathbb{R}^n)$ be represented with respect to the canonical basis by a
symmetric matrix $A = [a_{ij}]$. Let $\beta \in \mathrm{End}(\mathbb{R}^n)$ be represented by the matrix $B = [e^{a_{ij}}]$. If
$\alpha$ is positive semidefinite, is $\beta$ necessarily positive semidefinite? Is $\beta$ necessarily
positive definite?


18 Unitary and Normal Endomorphisms


Let $V$ be an inner product space. An automorphism of $V$ which is an isometry is
called a unitary automorphism. It is easy to see that if $\alpha$ and $\beta$ are unitary
automorphisms of $V$ then $\alpha\beta$ and $\alpha^{-1}$ are also unitary automorphisms of $V$. It is also clear
that $\sigma_1$ is unitary. Therefore, the set of all unitary automorphisms of $V$ is a group of
automorphisms.


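Proposition 18.1 Let $V$ be an inner product space and let $\alpha \in \mathrm{Aut}(V)$ have
an adjoint. Then $\alpha$ is unitary if and only if $\alpha^* = \alpha^{-1}$.


Proof If $\alpha$ is unitary then $\langle \alpha(v), w \rangle = \langle \alpha(v), \alpha\alpha^{-1}(w) \rangle = \langle v, \alpha^{-1}(w) \rangle$ for all
$v, w \in V$ and so $\alpha^* = \alpha^{-1}$. Conversely, if $\alpha^* = \alpha^{-1}$ then
$\langle \alpha(v), \alpha(w) \rangle = \langle v, \alpha^*\alpha(w) \rangle = \langle v, w \rangle$ for all $v, w \in V$ and so $\alpha$ is unitary.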


As a direct consequence of Proposition 17.14, we see that if $V$ is an inner product
space finitely generated over its field of scalars then for $\alpha \in \mathrm{End}(V)$ the following
conditions are equivalent:

(1) $\alpha$ is an isometry;
(2) $\alpha$ is unitary;
(3) $\alpha$ maps an orthonormal basis of $V$ to an orthonormal basis of $V$.

If $V$ is an inner product space finitely generated over its field of scalars $F$, and
if $\alpha$ is a unitary automorphism of $V$ represented by a matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(F)$
with respect to a given orthonormal basis, then we see that $A^{-1} = A^H \in \mathcal{M}_{n \times n}(F)$.
A matrix of this form over $F$ is called a unitary matrix. If $A$ is a unitary matrix
then so is $A^{-1}$ since $(A^{-1})^H = (A^H)^{-1}$. Also, if $A$ and $B$ are unitary matrices
then $(AB)^{-1} = B^{-1}A^{-1} = B^H A^H = (AB)^H$ so $AB$ is also unitary. The converse
is false. For example, the matrix $A = \begin{pmatrix} -1 & 1 \\ 0 & 1 \end{pmatrix} \in \mathcal{M}_{2 \times 2}(\mathbb{R})$ is not unitary, but
$A^2 = I$ is.

Thus we see that the set of unitary matrices in $\mathcal{M}_{n \times n}(F)$ defines a group of
automorphisms of $F^n$ and so an equivalence relation $\sim$ defined by the condition that
$A \sim B$ if and only if there exists a unitary matrix $P$ such that $A = P^{-1}BP$. Matrices
equivalent in this sense are unitarily similar. As an immediate consequence of the
definition, we see that $A$ is unitary if and only if the set of columns (resp., rows) of
$A$ is an orthonormal basis of $F^n$ (resp., $\mathcal{M}_{1 \times n}(F)$) endowed with the dot product.


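Proposition 18.2 Let $n$ be a positive integer and let $A = [a_{ij}]$ and $B = [b_{ij}]$
be unitarily-similar matrices in $\mathcal{M}_{n \times n}(\mathbb{C})$. Then

$$\sum_{i=1}^n \sum_{j=1}^n |a_{ij}|^2 = \sum_{i=1}^n \sum_{j=1}^n |b_{ij}|^2.$$


Proof We note that $\sum_{i=1}^n \sum_{j=1}^n |a_{ij}|^2 = \mathrm{tr}(A^H A)$. If $P$ is a unitary matrix
satisfying $B = P^{-1}AP$ then $\mathrm{tr}(B^H B) = \mathrm{tr}(P^{-1}A^H PP^{-1}AP) = \mathrm{tr}(P^{-1}A^H AP) =
\mathrm{tr}(A^H APP^{-1}) = \mathrm{tr}(A^H A)$, and we are done.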


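Example If $c, d \in \mathbb{C}$ satisfy the condition that $|c|^2 + |d|^2 = 1$, then the matrix
$\begin{pmatrix} c & d \\ -\bar{d} & \bar{c} \end{pmatrix} \in \mathcal{M}_{2 \times 2}(\mathbb{C})$ is unitary. A matrix of this form is known as a Givens
rotation matrix. More generally, if $n \geq 3$ then a matrix $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{C})$ is a
Givens rotation matrix if and only if there exist integers $1 \leq h < k \leq n$ and nonzero
complex numbers $c$ and $d$ satisfying $|c|^2 + |d|^2 = 1$ such that

$$a_{ij} = \begin{cases} c & \text{if } i = j = h, \\ \bar{c} & \text{if } i = j = k, \\ 1 & \text{if } i = j \notin \{h, k\}, \\ d & \text{if } i = h \text{ and } j = k, \\ -\bar{d} & \text{if } i = k \text{ and } j = h, \\ 0 & \text{otherwise.} \end{cases}$$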


These matrices play important roles in numerical algorithms. 
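
One typical numerical use is to rotate a vector so that a chosen coordinate becomes zero. The sketch below restricts itself to real $c$ and $d$, in which case the Givens rotation is an orthogonal matrix; the function names `givens` and `matvec`, the 0-based indices, and the sample vector are assumptions made for illustration.

```python
import math

def givens(n, h, k, c, d):
    """n x n real Givens rotation acting on coordinates h < k (0-based)."""
    g = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g[h][h] = c
    g[k][k] = c
    g[h][k] = d
    g[k][h] = -d
    return g

def matvec(m, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

# Choose c and d with c^2 + d^2 = 1 so that coordinate 2 of x is rotated into coordinate 0.
x = [3.0, 0.0, 4.0]
r = math.hypot(x[0], x[2])
G = givens(3, 0, 2, x[0] / r, x[2] / r)
y = matvec(G, x)
```

With this choice of $c$ and $d$ the last coordinate of `y` vanishes (up to rounding) and its first coordinate equals the length of `x`.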




James Wallace Givens, a former assistant to von Neumann who is
considered one of the fathers of twentieth-century American numerical
analysis, made major contributions to numerical matrix computation.


Example The matrix $A = \frac{1}{2}\begin{pmatrix} 1 + i & 1 - i \\ 1 - i & 1 + i \end{pmatrix} \in \mathcal{M}_{2 \times 2}(\mathbb{C})$ is unitary. This matrix has
important applications in the modeling of quantum computing, where it is often
denoted by $\sqrt{\mathrm{NOT}}$, since $A^2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ represents the negation operator in this
context.
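
Both claims are easy to confirm with Python's built-in complex numbers. The sketch assumes that $A$ is the matrix $\frac{1}{2}\begin{pmatrix} 1+i & 1-i \\ 1-i & 1+i \end{pmatrix}$; the helper names `matmul` and `conj_transpose` are illustrative.

```python
A = [[(1 + 1j) / 2, (1 - 1j) / 2],
     [(1 - 1j) / 2, (1 + 1j) / 2]]

def matmul(x, y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conj_transpose(x):
    """Conjugate transpose of a 2x2 matrix."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

square = matmul(A, A)                 # should be the negation (swap) matrix
gram = matmul(conj_transpose(A), A)   # should be the identity matrix
```

Here `square` is the matrix with rows $(0, 1)$ and $(1, 0)$, and `gram` is the identity, so $A$ is unitary and squares to the negation operator.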


/ 2 . 
Example It is easy to show that Ap = an Jel € M2 x2(C) satisfies 


$A_b A_b^T = I$ for any real number $b$, but, except for the case of $b = 0$, it is not unitary.


Unitarily-similar matrices are surely similar, but the converse is not true. 


—2 0 0 2 
tarily similar, as we can see from Proposition 18.2. 


Example The matrices ? | and ae in. M>2,.2(R) are similar but not uni- 


Proposition 18.3 (Schur's Theorem) If $n$ is a positive integer, then every
matrix in $\mathcal{M}_{n \times n}(\mathbb{C})$ is unitarily similar to an upper-triangular matrix.


Proof We will proceed by induction on $n$. For $n = 1$, the result is trivial since every
$1 \times 1$ matrix is upper triangular. Assume now that $n > 1$ and that the result has
been established for $\mathcal{M}_{(n-1) \times (n-1)}(\mathbb{C})$. Let $A = [a_{ij}] \in \mathcal{M}_{n \times n}(\mathbb{C})$. Since we are
working over $\mathbb{C}$, we know that the characteristic polynomial of $A$ is completely
reducible, and so $A$ has an eigenvalue, call it $c_1$. Corresponding to that eigenvalue,
we have a normal eigenvector $v_1 = \begin{pmatrix} d_1 \\ \vdots \\ d_n \end{pmatrix}$ in which we can assume that $d_1 \in \mathbb{R}$.
We now are able to construct a basis $\{v_1, \ldots, v_n\}$ for $\mathbb{C}^n$ to which we can apply the
Gram-Schmidt procedure, and thus assume that it is in fact an orthonormal basis
(the vector $v_1$ does not change, since it was assumed to be normal to begin with).
The matrix $P_1$, the columns of which are these vectors, is therefore unitary. Now
set $A_1 = P_1^{-1}AP_1$. It is easy to see that the first column of $A_1$ is of the form
$\begin{pmatrix} c_1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}$ so we can write $A_1$ in block form as $\begin{pmatrix} c_1 & y \\ O & A_2 \end{pmatrix}$, where $y \in \mathcal{M}_{1 \times (n-1)}(\mathbb{C})$ and $A_2 \in \mathcal{M}_{(n-1) \times (n-1)}(\mathbb{C})$.
By the induction hypothesis, there is a unitary matrix $Q \in \mathcal{M}_{(n-1) \times (n-1)}(\mathbb{C})$ such
that $Q^{-1}A_2Q$ is an upper-triangular matrix. Now set $P_2 = \begin{pmatrix} 1 & O \\ O & Q \end{pmatrix}$. Then $P_2$ is
a unitary matrix in $\mathcal{M}_{n \times n}(\mathbb{C})$ and

$$P_2^{-1}P_1^{-1}AP_1P_2 = \begin{pmatrix} c_1 & yQ \\ O & Q^{-1}A_2Q \end{pmatrix}$$

is an upper triangular matrix in $\mathcal{M}_{n \times n}(\mathbb{C})$. Since $P_1P_2$ is again unitary, we are done.
If we are working over $\mathbb{R}$, then a matrix $A$ representing a unitary automorphism
of $\mathbb{R}^n$ satisfies $A^{-1} = A^T$. Such a matrix is called an orthogonal matrix. It is clear
that the matrix $I$ is orthogonal and that $A^{-1}$ is orthogonal whenever $A$ is orthogonal.
If $A$ and $B$ are orthogonal matrices then $(AB)^{-1} = B^{-1}A^{-1} = B^T A^T = (AB)^T$
and so $AB$ is also orthogonal. As an immediate consequence of the definition, we
see that $A$ is orthogonal if and only if the set of columns (resp., rows) of $A$ is an
orthonormal basis of $\mathbb{R}^n$ (resp., $\mathcal{M}_{1 \times n}(\mathbb{R})$) endowed with the dot product. It is also
clear that $A$ is orthogonal if and only if $A^T$ is orthogonal.


Example Permutation matrices, which we considered earlier, are clearly orthogonal. 

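Example The matrices $\begin{pmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{pmatrix}$ and $\begin{pmatrix} \cos(t) & \sin(t) \\ \sin(t) & -\cos(t) \end{pmatrix}$ are orthogonal
for every $t \in \mathbb{R}$, and one can show that these are the only orthogonal matrices in
$\mathcal{M}_{2 \times 2}(\mathbb{R})$. Matrices of the first type are just Givens rotation matrices; matrices of
the second type are known as Jacobi reflection matrices. Indeed, suppose that the matrix
$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \in \mathcal{M}_{2 \times 2}(\mathbb{R})$ is orthogonal.
Then $a_{11}^2 + a_{12}^2 = 1 = a_{21}^2 + a_{22}^2$, so $-1 \leq a_{11} \leq 1$. Hence there exists a real number
$t$ such that $a_{11} = \cos(t)$. Then $a_{12}^2 = 1 - a_{11}^2 = 1 - \cos^2(t) = \sin^2(t)$ and so $a_{12} = \pm\sin(t)$.
Also, $a_{11} = \cos(-t)$ and $\sin(-t) = -\sin(t)$. Thus, replacing $t$ by $-t$ if
necessary we can assume that $a_{11} = \cos(t)$ and $a_{12} = \sin(t)$. Similarly, there exists
an angle $s$ such that $a_{22} = \cos(s)$ and $a_{21} = \sin(s)$.

Since $0 = a_{11}a_{21} + a_{12}a_{22} = \cos(t)\sin(s) + \sin(t)\cos(s) = \sin(t + s)$, we see
that $t + s = 0$ or $t + s = \pi$. If $t + s = 0$, we obtain $A = \begin{pmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{pmatrix}$. If
$t + s = \pi$, then $s = \pi - t$ and so $A = \begin{pmatrix} \cos(t) & \sin(t) \\ \sin(t) & -\cos(t) \end{pmatrix}$ since $\sin(t) = \sin(\pi - t)$
and $-\cos(t) = \cos(\pi - t)$.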
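One can also show that every orthogonal matrix in $\mathcal{M}_{3 \times 3}(\mathbb{R})$ is similar to a
matrix of the form $\begin{pmatrix} \cos(t) & \sin(t) & 0 \\ -\sin(t) & \cos(t) & 0 \\ 0 & 0 & \pm 1 \end{pmatrix}$ for some $t \in \mathbb{R}$. More generally, if
$n > 2$ then every orthogonal matrix in $\mathcal{M}_{n \times n}(\mathbb{R})$ is similar to a matrix in block
form $[D_{ij}]$, where $D_{ij} = O$ if $i \neq j$ and $D_{ii}$ is either $1$, $-1$, or a $2 \times 2$ matrix of the
form $\begin{pmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{pmatrix}$.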

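Example Let $n$ be a positive integer. If $0 < c < 1$ is a real number, the matrix
$\begin{pmatrix} (\sqrt{c})I & -(\sqrt{1 - c})I \\ (\sqrt{1 - c})I & (\sqrt{c})I \end{pmatrix} \in \mathcal{M}_{2n \times 2n}(\mathbb{R})$ is orthogonal, where $I$ denotes the
identity matrix in $\mathcal{M}_{n \times n}(\mathbb{R})$.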


Example Let $n$ be a positive integer and let $V = \mathbb{R}^n$, on which we have defined
the dot product. If $v \in V$ is a normal vector, then the matrix $A = I - 2(v \wedge v)$ is a
Householder matrix. These matrices are clearly symmetric. Moreover, if $A = I - 2(v \wedge v)$
then $A^T A = A^2 = (I - 2[v \wedge v])^2 = I - 4vv^T + 4v(v^T v)v^T = I$, since $v^T v = 1$, and
so $A$ is orthogonal. Householder matrices have important uses in numerical analysis.
We should also mention that if $u \neq v$ are vectors in $V$ satisfying $\|u\| = \|v\|$, then
the vector $w = \|v - u\|^{-1}(v - u)$ defines a Householder matrix $A = I - 2(w \wedge w)$
satisfying $Au = v$. Since a Householder matrix is totally determined by one vector,
it is easy to store in a computer. One of the important uses of Householder matrices is
to compute QR-decompositions of matrices in a manner far more stable numerically
than via the use of the Gram-Schmidt method.
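
The construction of a Householder matrix carrying one vector to another of the same norm can be sketched as follows. The sketch assumes that $w \wedge w$ denotes the matrix $ww^T$ and that the resulting matrix sends $u$ to $v$; the function names `householder` and `matvec` and the sample vectors are illustrative.

```python
import math

def householder(w):
    """Return I - 2 w w^T for a unit vector w, as a list of rows."""
    n = len(w)
    return [[(1.0 if i == j else 0.0) - 2.0 * w[i] * w[j] for j in range(n)]
            for i in range(n)]

def matvec(m, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

u = [3.0, 4.0, 0.0]
v = [0.0, 0.0, 5.0]   # same norm as u, and different from u
diff = [vi - ui for ui, vi in zip(u, v)]
length = math.sqrt(sum(x * x for x in diff))
w = [x / length for x in diff]   # the unit vector (v - u) / ||v - u||
H = householder(w)
Hu = matvec(H, u)
```

Up to rounding, `Hu` equals `v`, and `H` is symmetric with `H` times `H` equal to the identity. Only the vector `w` needs to be stored in order to apply `H`, which is the storage advantage mentioned above.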


Alston Householder, a twentieth-century American mathematician, 
was among the pioneer researchers of the numerical analysis of ma- 
trices using computers, who developed many of the basic algorithms 
used in this field. 


The complex analogs of Householder matrices are matrices of the form $I - 2ww^H$, where $w \in \mathbb{C}^n$ is a normal vector. Such matrices are Hermitian and unitary and they, too, have an important role in numerical computation.


Example A general method for the construction of orthogonal matrices, due to the contemporary American mathematician George W. Soules, is given as follows: Let $n > 1$ be an integer and let $w_1 \in \mathbb{R}^n$ be a normal vector all of the entries of which are positive. Let $1 \leq k < n$ and write $w_1 = \begin{bmatrix} u \\ v \end{bmatrix}$, where $u \in \mathbb{R}^k$ and $v \in \mathbb{R}^{n-k}$. Set $a = \dfrac{\|v\|}{\|u\|}$ and $w_2 = \begin{bmatrix} au \\ -a^{-1}v \end{bmatrix}$. Then it is easy to see that $w_2$ is normal and orthogonal to $w_1$. Moreover, by further partitioning the vectors $au$ and $-a^{-1}v$, we can eventually construct mutually-orthogonal normal vectors $w_1, w_2, \dots, w_n$. The matrix with these vectors as columns is then orthogonal.
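
One way to carry out the repeated partitioning is sketched below, assuming NumPy. The text does not say where to split or how to normalize the vectors produced at the deeper levels; here each block is split in the middle and every new vector is rescaled to norm 1, and both choices are assumptions made for this illustration. The names `soules_columns` and `soules_matrix` are hypothetical.

```python
import numpy as np

def soules_columns(x, lo, hi, out):
    """Append unit vectors supported on x[lo:hi] that are orthogonal to x there."""
    if hi - lo < 2:
        return
    k = lo + (hi - lo) // 2              # split point: an arbitrary choice
    u, v = x[lo:k], x[k:hi]
    a = np.linalg.norm(v) / np.linalg.norm(u)
    w = np.zeros_like(x)
    w[lo:k] = a * u                      # the block "au"
    w[k:hi] = -v / a                     # the block "-a^{-1} v"
    out.append(w / np.linalg.norm(w))    # rescale; only the top-level w is already normal
    soules_columns(x, lo, k, out)        # partition the first block further
    soules_columns(x, k, hi, out)        # partition the second block further

def soules_matrix(w1):
    """Build an orthogonal matrix whose first column is the normalized w1."""
    w1 = np.asarray(w1, dtype=float)
    w1 = w1 / np.linalg.norm(w1)
    cols = [w1]
    soules_columns(w1, 0, w1.size, cols)
    return np.column_stack(cols)

Q = soules_matrix([1.0, 2.0, 3.0, 4.0, 5.0])
print(np.allclose(Q.T @ Q, np.eye(5)))
```

A block of length $m$ contributes $m - 1$ vectors, so together with $w_1$ the recursion yields exactly $n$ columns.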


Notice that if $F$ is either $\mathbb{R}$ or $\mathbb{C}$, and if $A \in M_{n\times n}(F)$ is a unitary matrix the columns of which are $v_1, \dots, v_n$, then the identity $AA^H = I$ implies that $\{v_1, \dots, v_n\}$ is an orthonormal set of vectors in $F^n$, on which we have the dot product, and hence it is a basis for this space. Conversely, if $\{v_1, \dots, v_n\}$ is an orthonormal basis of $F^n$ then the matrix the columns of which are these vectors is unitary. Similarly, a matrix in $M_{n\times n}(\mathbb{R})$ is orthogonal if and only if the set of its columns forms an orthonormal basis for $\mathbb{R}^n$ with the dot product. Another way of putting this is that a matrix in $M_{n\times n}(\mathbb{R})$ the columns of which are $v_1, \dots, v_n$ is orthogonal if and only if $\sum_{i=1}^n v_i \wedge v_i = I$.

Proposition 18.4 Let $V$ be an inner product space of finite dimension $n$ over $\mathbb{R}$. Let $\alpha$ be a unitary automorphism of $V$, which is represented by a matrix $A \in M_{n\times n}(\mathbb{R})$ with respect to a given orthonormal basis of $V$. Then $|A| = \pm 1$.


Proof We know that if $\alpha$ is represented by $A = [a_{ij}]$ with respect to the given basis, then $\alpha^*$ is represented by $A^T$. From Proposition 18.1, we deduce that $AA^T = I$ and so $|A|^2 = |A| \cdot |A^T| = |I| = 1$, which in turn implies that $|A| = \pm 1$.


Example The converse of Proposition 18.4 is false, even for matrices the columns of which are orthogonal. Thus, the matrix $\begin{bmatrix} 0.25 & 0 \\ 0 & 4 \end{bmatrix}$ has determinant 1, but does not represent a unitary automorphism of $\mathbb{R}^2$.


The orthogonal matrices in $M_{n\times n}(\mathbb{R})$ having determinant equal to 1 are known as the special orthogonal matrices, and the set of all such matrices is denoted by $SO(n)$. This subset of $M_{n\times n}(\mathbb{R})$ is clearly closed under taking products as well as taking inverses, since if $A \in SO(n)$ then $|A^{-1}| = |A^T| = |A| = 1$. If $A \in M_{n\times n}(\mathbb{R})$ is a special orthogonal matrix, where $n$ is an odd integer, then $1 \in \operatorname{spec}(A)$. To see this, we note that $|A - I| = |A - AA^T| = |A| \cdot |I - A^T| = |I - A^T| = |I - A|$ and, since $n$ is odd, $|I - A| = (-1)^n|A - I| = -|A - I|$. Thus we must have $|A - I| = 0$, and so $1 \in \operatorname{spec}(A)$.
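
The determinant argument can be illustrated numerically for $n = 3$. The sketch below assumes NumPy; building a special orthogonal matrix from the QR factorization of a pseudo-random matrix is merely one convenient way to obtain a test case, and the helper name is hypothetical.

```python
import numpy as np

def random_special_orthogonal(n, seed=0):
    """Return an n x n orthogonal matrix with determinant +1."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]   # flipping one column changes the sign of the determinant
    return Q

A = random_special_orthogonal(3)
det_A = np.linalg.det(A)
det_A_minus_I = np.linalg.det(A - np.eye(3))

print("|A|     =", round(det_A, 10))
print("|A - I| =", round(det_A_minus_I, 10))  # vanishes, so 1 is an eigenvalue
```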


Example We have already noted that the only orthogonal matrices in $M_{2\times 2}(\mathbb{R})$ are of the form $\begin{bmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{bmatrix}$ or $\begin{bmatrix} \cos(t) & \sin(t) \\ \sin(t) & -\cos(t) \end{bmatrix}$ for some $t \in \mathbb{R}$. Matrices of the first type are special, whereas matrices of the second type are not.


Example The matrix $\begin{bmatrix} 0 & -1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & -1 \end{bmatrix} \in M_{5\times 5}(\mathbb{R})$ is special orthogonal.


Let $V$ be an inner product space. An endomorphism $\alpha \in \operatorname{End}(V)$ is normal if and only if $\alpha^*$ exists and satisfies $\alpha^*\alpha = \alpha\alpha^*$. From this definition, it is clear that $\alpha$ is normal if and only if $\alpha^*$ is normal. Clearly, selfadjoint endomorphisms of $V$ are normal, as are unitary automorphisms.


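Example If $a, b \in \mathbb{R}$ satisfy $b \neq 0$ and $a^2 + b^2 \neq 1$, then the automorphism $\alpha$ of $\mathbb{R}^2$ defined by $\alpha: \begin{bmatrix} c \\ d \end{bmatrix} \mapsto \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \begin{bmatrix} c \\ d \end{bmatrix}$ is normal but neither unitary nor selfadjoint.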
Example If $0 \neq a, b \in \mathbb{R}$ then the automorphism $\alpha$ of $\mathbb{C}^2$ defined by


JE 


is normal but neither unitary nor selfadjoint.


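Proposition 18.5 Let $V$ be an inner product space. An endomorphism $\alpha \in \operatorname{End}(V)$ for which $\alpha^*$ exists is normal if and only if $\|\alpha(v)\| = \|\alpha^*(v)\|$ for all $v \in V$.


Proof If $\alpha$ is normal and $v \in V$, then $\|\alpha(v)\|^2 = \langle \alpha(v), \alpha(v) \rangle = \langle v, \alpha^*\alpha(v) \rangle = \langle v, \alpha\alpha^*(v) \rangle = \langle v, \alpha^{**}\alpha^*(v) \rangle = \langle \alpha^*(v), \alpha^*(v) \rangle = \|\alpha^*(v)\|^2$ and so $\|\alpha(v)\| = \|\alpha^*(v)\|$. Conversely, assume that this condition holds. Then for each $v \in V$ we have

$$\langle (\alpha\alpha^* - \alpha^*\alpha)(v), v \rangle = \langle \alpha\alpha^*(v), v \rangle - \langle \alpha^*\alpha(v), v \rangle = \langle \alpha^*(v), \alpha^*(v) \rangle - \langle \alpha(v), \alpha(v) \rangle = 0.$$

But $\alpha\alpha^* - \alpha^*\alpha$ is selfadjoint and so, by Proposition 17.3, we see that $\alpha\alpha^* - \alpha^*\alpha = \sigma_0$, and so $\alpha\alpha^* = \alpha^*\alpha$.


As a consequence of Proposition 18.5 we see that if $\alpha$ is a normal endomorphism of an inner product space $V$ and if $v \in V$ then $v \in \ker(\alpha) \Leftrightarrow \|\alpha(v)\| = 0 \Leftrightarrow \|\alpha^*(v)\| = 0 \Leftrightarrow v \in \ker(\alpha^*)$ and so $\ker(\alpha) = \ker(\alpha^*)$.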

We now take a short look at the extensive theory of eigenvalues of normal 
endomorphisms of inner product spaces. We will restrict our attention to finite- 
dimensional spaces, since the theory for infinite-dimensional spaces requires ad- 
ditional topological assumptions. 


With kind permission of the American Mathematical Society. 


The study of eigenvalues of normal and selfadjoint endomorphisms of inner product spaces was developed simultaneously by the American mathematician Marshall Stone and by John von Neumann, inspired by problems in quantum theory.

Proposition 18.6 Let $V$ be an inner product space and let $\alpha \in \operatorname{End}(V)$ be normal. Then every eigenvector of $\alpha$ is also an eigenvector of $\alpha^*$ and if $c$ is an eigenvalue of $\alpha$ then $\bar{c}$ is an eigenvalue of $\alpha^*$.

Proof If $v \in V$ then, as we have noted, $\|\alpha(v)\| = \|\alpha^*(v)\|$. For a scalar $c$, we see that

$$(\alpha - c\sigma_1)^*(\alpha - c\sigma_1) = (\alpha^* - \bar{c}\sigma_1)(\alpha - c\sigma_1) = (\alpha - c\sigma_1)(\alpha^* - \bar{c}\sigma_1) = (\alpha - c\sigma_1)(\alpha - c\sigma_1)^*,$$

and so $\alpha - c\sigma_1$ is also normal. Thus, $\|(\alpha - c\sigma_1)(v)\| = \|(\alpha^* - \bar{c}\sigma_1)(v)\|$ for $v \in V$ and so, in particular, we see that $v \in \ker(\alpha - c\sigma_1)$ if and only if $v \in \ker(\alpha^* - \bar{c}\sigma_1)$. In other words, $v$ is an eigenvector of $\alpha$ associated with the eigenvalue $c$ if and only if it is an eigenvector of $\alpha^*$ associated with the eigenvalue $\bar{c}$.


Since $\alpha^{**} = \alpha$ for any endomorphism $\alpha$ of $V$, we see from Proposition 18.6 that if $\alpha$ is normal then a scalar $c$ is an eigenvalue of $\alpha$ if and only if $\bar{c}$ is an eigenvalue of $\alpha^*$.

Another interesting consequence of Proposition 18.6 is the following: Let $V$ be a finitely-generated inner product space and let $\alpha \in \operatorname{Aut}(V)$ be unitary. Then $\alpha$ is surely normal. If $c \in \operatorname{spec}(\alpha)$ then $c \neq 0$ since $\alpha$ is an automorphism. If $v$ is an eigenvector associated with $c$ then $v = (\alpha^*\alpha)(v) = \alpha^*(cv) = c\alpha^*(v)$ and so $\alpha^*(v) = c^{-1}v$. This shows that $c^{-1}$ is an eigenvalue of $\alpha^*$ and hence, by Proposition 18.6, $\bar{c}^{-1} \in \operatorname{spec}(\alpha)$.


Example In Proposition 17.5, we saw that if $\alpha$ is a selfadjoint endomorphism of an inner product space $V$ finitely generated over $\mathbb{R}$, then $\operatorname{spec}(\alpha) \neq \emptyset$. This is not necessarily true for normal endomorphisms of inner product spaces which are not selfadjoint. For example, let $V = \mathbb{R}^2$ together with the dot product, and if $\alpha \in \operatorname{End}(V)$ is defined by $\alpha: \begin{bmatrix} a \\ b \end{bmatrix} \mapsto \begin{bmatrix} -b \\ a \end{bmatrix}$, then we have already seen that $\operatorname{spec}(\alpha) = \emptyset$. One can easily check that $\alpha$ is normal but not selfadjoint.


Proposition 18.7 Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha \in \operatorname{End}(V)$. Then $\alpha$ is normal if and only if it is orthogonally diagonalizable.


Proof Assume that $\alpha$ is normal. We will proceed by induction on $n = \dim(V)$. First, assume that $n = 1$. Since we are working over $\mathbb{C}$, we know that $\operatorname{spec}(\alpha) \neq \emptyset$ and so there exists a normal eigenvector $v_1$ of $\alpha$. Then $V = \mathbb{C}v_1$ and we are done. Now assume that $n > 1$ and that the result has been proven for spaces of dimension $n - 1$. Again, there exists a normal eigenvector $v_1$ of $\alpha$. Set $W = \mathbb{C}v_1$. The subspace $W$ of $V$ is invariant under $\alpha$, and so, by Proposition 18.6, it is also invariant under $\alpha^*$. Therefore, $W^\perp$ is invariant under $\alpha^{**} = \alpha$. The restriction of $\alpha$ to $W^\perp$ is a normal endomorphism, the adjoint of which is the restriction of $\alpha^*$ to $W^\perp$. By induction, we know that there exists an orthonormal basis $\{v_2, \dots, v_n\}$ of $W^\perp$ composed of eigenvectors of $\alpha$, and so $\{v_1, \dots, v_n\}$ is the basis of $V$ we are seeking.

Conversely, assume that there exists an orthonormal basis $B = \{v_1, \dots, v_n\}$ of $V$ composed of eigenvectors of $\alpha$. Then $\Phi_{BB}(\alpha) = [a_{ij}]$ is a diagonal matrix satisfying the condition that each $a_{ii}$ is an eigenvalue of $\alpha$. Moreover, $\Phi_{BB}(\alpha^*) = \Phi_{BB}(\alpha)^H$ and this too is a diagonal matrix. Since diagonal matrices commute, we see that $\alpha\alpha^* = \alpha^*\alpha$, and so $\alpha$ is normal.


Note that Proposition 18.7 does not imply that if V is an inner product space 
finitely generated over C and if a € End(V) is normal, then every basis B of V 
composed of eigenvectors of a is necessarily orthonormal or that its elements are 
even necessarily mutually orthogonal, merely that one such basis exists. 


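Example Let $\alpha$ be the endomorphism of $\mathbb{C}^4$ represented with respect to the canonical basis by the matrix
$$A = \begin{bmatrix} 1 & 2 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ 0 & 0 & 3 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix}.$$
One easily checks that $AA^H = A^HA$, and so $\alpha$ is a normal automorphism of $\mathbb{C}^4$. The characteristic polynomial of $A$ is

$$X^4 - 8X^3 + 27X^2 - 50X + 50 = (X^2 - 6X + 10)(X^2 - 2X + 5),$$

and so $\operatorname{spec}(\alpha) = \{3 \pm i, 1 \pm 2i\}$. The set

$$\left\{ \frac{1}{\sqrt{2}}\begin{bmatrix} -i \\ 1 \\ 0 \\ 0 \end{bmatrix},\ \frac{1}{\sqrt{2}}\begin{bmatrix} i \\ 1 \\ 0 \\ 0 \end{bmatrix},\ \frac{1}{\sqrt{2}}\begin{bmatrix} 0 \\ 0 \\ 1 \\ i \end{bmatrix},\ \frac{1}{\sqrt{2}}\begin{bmatrix} 0 \\ 0 \\ i \\ 1 \end{bmatrix} \right\}$$

is an orthonormal basis for $\mathbb{C}^4$ composed of eigenvectors of $\alpha$.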
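
The claims of this example can be verified directly. The following is a small sketch assuming NumPy; the variable names are illustrative. It checks that $A$ commutes with its conjugate transpose and that the matrix $B$ whose columns are the four listed vectors is unitary and diagonalizes $A$.

```python
import numpy as np

A = np.array([[ 1, 2, 0,  0],
              [-2, 1, 0,  0],
              [ 0, 0, 3, -1],
              [ 0, 0, 1,  3]], dtype=complex)
AH = A.conj().T
is_normal = np.allclose(A @ AH, AH @ A)

# Columns of B are the four proposed orthonormal eigenvectors.
B = (1 / np.sqrt(2)) * np.array([[-1j, 1j, 0,  0],
                                 [  1,  1, 0,  0],
                                 [  0,  0, 1, 1j],
                                 [  0,  0, 1j, 1]])
is_unitary = np.allclose(B.conj().T @ B, np.eye(4))

# B^H A B should be diagonal, with the eigenvalues on the diagonal.
D = B.conj().T @ A @ B

print(is_normal, is_unitary)
print(np.round(np.diag(D), 10))
```

The diagonal of `D` lists the eigenvalues in the order in which the eigenvectors appear as columns of `B`.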


Proposition 18.8 Let $V$ be a finitely-generated inner product space. Then the following conditions on a projection $\alpha \in \operatorname{End}(V)$ are equivalent:

(1) $\alpha$ is normal;

(2) $\alpha$ is selfadjoint;

(3) $\ker(\alpha) = \operatorname{im}(\alpha)^\perp$.


Proof (1) $\Rightarrow$ (2): From (1) we know that $\|\alpha(v)\| = \|\alpha^*(v)\|$ for all $v \in V$. In particular, $\alpha(v) = 0_V$ if and only if $\alpha^*(v) = 0_V$ so that $\ker(\alpha) = \ker(\alpha^*)$. If $v \in V$ and $w = v - \alpha(v)$ then $\alpha(w) = \alpha(v) - \alpha^2(v) = \alpha(v) - \alpha(v) = 0_V$ and so $\alpha^*(w) = 0_V$. Therefore, $\alpha^*(v) = \alpha^*\alpha(v)$ for all $v \in V$, whence $\alpha^* = \alpha^*\alpha$. This implies that $\alpha = \alpha^{**} = (\alpha^*\alpha)^* = \alpha^*\alpha^{**} = \alpha^*\alpha = \alpha^*$, which proves (2).

(2) $\Rightarrow$ (3): If $v, w \in V$ then, from (2), we see that $\langle \alpha(v), w \rangle = \langle v, \alpha(w) \rangle$. In particular, if $v \in \ker(\alpha)$ then $\langle v, \alpha(w) \rangle = 0$ for all $w \in V$, which is to say that $v \in \operatorname{im}(\alpha)^\perp$. Conversely, if $v \in \operatorname{im}(\alpha)^\perp$ then $\langle v, \alpha(w) \rangle = 0$ for all $w \in V$, which implies that $\alpha(v)$ is orthogonal to every element of $V$. Therefore, $\alpha(v) = 0_V$ and so $v \in \ker(\alpha)$. This proves (3).

(3) $\Rightarrow$ (1): Let $v, w \in V$. Since $\alpha$ is a projection, we have $v - \alpha(v) \in \ker(\alpha)$. It is also clear that $\alpha(w) \in \operatorname{im}(\alpha)$. Therefore,

$$0 = \langle v - \alpha(v), \alpha(w) \rangle = \langle v, \alpha(w) \rangle - \langle \alpha(v), \alpha(w) \rangle = \langle v, \alpha(w) \rangle - \langle v, \alpha^*\alpha(w) \rangle,$$

and since this is true for all $v, w \in V$, we have $\alpha = \alpha^*\alpha$. This implies that $\alpha = \alpha^*$ is selfadjoint and therefore surely normal, proving (1).


We note that if $V$ is a finitely-generated inner product space and if $\alpha \in \operatorname{End}(V)$ is normal, then, by Propositions 16.5, 16.7 and 18.5, we have $V = \ker(\alpha) \oplus \operatorname{im}(\alpha)$, and, in particular, $\{\operatorname{im}(\alpha), \ker(\alpha)\}$ is an independent set of subspaces of $V$. Moreover, $v \perp v'$ for all $v \in \ker(\alpha)$ and $v' \in \operatorname{im}(\alpha)$. While the direct-sum decomposition is valid for any projection, it is the normality which ensures the orthogonality.


Proposition 18.9 Let $V$ be a finitely-generated inner product space. Let $W_1, \dots, W_n$ be subspaces of $V$ and, for each $1 \leq i \leq n$, let $\alpha_i$ be the projection of $V$ onto the subspace $W_i$ coming from the decomposition $V = W_i \oplus W_i^\perp$. Then the following conditions are equivalent:

(1) $V = \bigoplus_{i=1}^n W_i$ and $W_h^\perp = \bigoplus_{j \neq h} W_j$ for all $1 \leq h \leq n$;

(2) $\alpha_1 + \cdots + \alpha_n = \sigma_1$ and $\alpha_i\alpha_j = \sigma_0$ for all $i \neq j$;

(3) If $B_i$ is an orthonormal basis of $W_i$ for each $i$, then $B = \bigcup_{i=1}^n B_i$ is an orthonormal basis of $V$.


Proof This has essentially already been established when we talked about the decomposition of a vector space into a direct sum of subspaces.


Proposition 18.10 Let $F$ be either $\mathbb{R}$ or $\mathbb{C}$ and let $V$ be a finitely-generated inner product space over $F$. If $p(X) \in F[X]$ and if $\alpha$ is a normal endomorphism of $V$, then $p(\alpha)$ is a normal endomorphism of $V$.


Proof If $p(X) = \sum_{i=0}^n a_iX^i$ then $p(\alpha) = \sum_{i=0}^n a_i\alpha^i$ and $p(\alpha)^* = \sum_{i=0}^n \bar{a}_i(\alpha^*)^i$. Since $\alpha\alpha^* = \alpha^*\alpha$, it follows from the definition of the product that $p(\alpha)p(\alpha)^* = p(\alpha)^*p(\alpha)$. Therefore, $p(\alpha)$ is a normal endomorphism of $V$.


Proposition 18.11 Let $V$ be a finitely-generated inner product space and let $\alpha$ be a normal endomorphism of $V$. If the minimal polynomial of $\alpha$ is completely reducible, then it does not have multiple roots.


Proof Let $p(X)$ be the minimal polynomial of $\alpha$, which we assume is completely reducible. Assume that there exists a scalar $c$ and a polynomial $q(X)$ such that $p(X) = (X - c)^2q(X)$. Since $p(\alpha) = \sigma_0$, we have $(\alpha - c\sigma_1)^2q(\alpha) = \sigma_0$ and so $\ker((\alpha - c\sigma_1)^2q(\alpha)) = V$. By Proposition 18.10, we know that $\beta = \alpha - c\sigma_1$ is a normal endomorphism of $V$. Let $v \in V$ and let $w = q(\alpha)(v)$. Then $\beta^2(w) = 0_V$ and so $\beta(w) \in \operatorname{im}(\beta) \cap \ker(\beta) = \{0_V\}$. Thus we see that $\beta q(\alpha)(v) = 0_V$ for all $v \in V$ and hence $\alpha$ annihilates the polynomial $(X - c)q(X)$, contradicting the minimality of $p(X)$.


Proposition 18.12 (Spectral Decomposition Theorem) Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha$ be a normal endomorphism of $V$. Then there exist scalars $c_1, \dots, c_n$ and projections $\alpha_1, \dots, \alpha_n$ of $V$ satisfying:

(1) $\alpha = c_1\alpha_1 + \cdots + c_n\alpha_n$;

(2) $\sigma_1 = \alpha_1 + \cdots + \alpha_n$;

(3) $\alpha_h\alpha_j = \sigma_0$ for all $h \neq j$.

Moreover, these $c_j$ and $\alpha_j$ are unique. The $c_j$ are precisely the distinct eigenvalues of $\alpha$ and each $\alpha_j$ is the projection of $V$ onto the eigenspace $W_j$ associated with $c_j$ coming from the decomposition $V = W_j \oplus W_j^\perp$.


Proof Let $p(X)$ be the minimal polynomial of $\alpha$, which we will write in the form $p(X) = \prod_{i=1}^n (X - c_i)$, where the $c_i$ are complex numbers which, by Proposition 18.11, are distinct. For each $1 \leq j \leq n$, let $p_j(X)$ be the $j$th Lagrange interpolation polynomial determined by the $c_i$.

Let $f(X)$ be a polynomial of degree at most $n - 1$. Then the polynomial $f(X) - \sum_{i=1}^n f(c_i)p_i(X)$ is of degree at most $n - 1$ and has $n$ distinct roots $c_1, \dots, c_n$. Thus it must be the 0-polynomial and so $f(X) = \sum_{i=1}^n f(c_i)p_i(X)$. In particular, we see that $1 = \sum_{j=1}^n p_j(X)$ and $X = \sum_{j=1}^n c_jp_j(X)$. Set $\alpha_j = p_j(\alpha)$. Then $\sigma_1 = \sum_{j=1}^n \alpha_j$ and $\alpha = \sum_{j=1}^n c_j\alpha_j$. Note that $\alpha_j \neq \sigma_0$ since $\alpha_j = p_j(\alpha)$ and the degree of $p_j(X)$ is less than the degree of the minimal polynomial of $\alpha$. Moreover, if $i \neq j$ then there exists a polynomial $u(X) \in \mathbb{C}[X]$ satisfying $\alpha_i\alpha_j = u(\alpha)p(\alpha) = u(\alpha)\sigma_0 = \sigma_0$. Thus we see that for all $1 \leq j \leq n$ we have $\alpha_j = \alpha_j\sigma_1 = \sum_{i=1}^n \alpha_j\alpha_i = \alpha_j^2$ and so each $\alpha_j$ is a projection. Thus we see that $\{\operatorname{im}(\alpha_j) \mid 1 \leq j \leq n\}$ is an independent set of subspaces of $V$.

Since the minimal polynomial and the characteristic polynomial of $\alpha$ have the same roots, we know that $\operatorname{spec}(\alpha) = \{c_1, \dots, c_n\}$. To show that $W_h = \operatorname{im}(\alpha_h)$, we have to prove that a vector $v$ belongs to $\operatorname{im}(\alpha_h)$ if and only if $\alpha(v) = c_hv$. Indeed, if $\alpha(v) = c_hv$ then $c_h\left[\sum_{j=1}^n \alpha_j(v)\right] = c_hv = \alpha(v) = \sum_{j=1}^n c_j\alpha_j(v)$ and so $\sum_{j=1}^n (c_h - c_j)\alpha_j(v) = 0_V$. Thus, for all $j \neq h$, we have $\alpha_j(v) = 0_V$ and so $v = \alpha_h(v) \in \operatorname{im}(\alpha_h)$. Conversely, if $v = \alpha_h(w)$ for some $w \in V$ then $\alpha(v) = \sum_{j=1}^n c_j\alpha_j\alpha_h(w) = c_h\alpha_h(w) = c_hv$.

Finally, we note that $\alpha_h$ is the projection coming from the decomposition $V = W_h \oplus W_h^\perp$ since $\alpha_h$ is a polynomial in $\alpha$ and hence normal and so the result follows from the remark after Proposition 18.8.

Note that we could have deduced Proposition 18.12 directly from Proposition 18.7. What is important in the above proof is the explicit construction of the projection maps as polynomials in $\alpha$.

If $\alpha = \sum_{i=1}^n c_i\alpha_i$ is as in Proposition 18.12, then $\alpha^k = \left(\sum_{i=1}^n c_i\alpha_i\right)^k = \sum_{i=1}^n c_i^k\alpha_i$ for any positive integer $k$, and from this we see that if $p(X) \in \mathbb{C}[X]$ then $p(\alpha) = \sum_{i=1}^n p(c_i)\alpha_i$.
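
The construction of the projections as Lagrange interpolation polynomials in $\alpha$ can be tried out on a small matrix. The sketch below assumes NumPy; the function name `spectral_projections` is hypothetical, and the distinct eigenvalues are passed in by hand rather than computed, to keep the illustration short. The test matrix is symmetric with eigenvalues $2$ and $0.5$.

```python
import numpy as np

def spectral_projections(A, eigenvalues):
    """Evaluate the Lagrange polynomials p_j at A for the given distinct eigenvalues."""
    n = A.shape[0]
    I = np.eye(n, dtype=complex)
    projections = []
    for j, cj in enumerate(eigenvalues):
        P = I.copy()
        for i, ci in enumerate(eigenvalues):
            if i != j:
                # multiply by the factor (X - c_i)/(c_j - c_i) evaluated at A
                P = P @ (A - ci * I) / (cj - ci)
        projections.append(P)
    return projections

A = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]], dtype=complex)
c = [2.0, 0.5]                       # the distinct eigenvalues of A
P1, P2 = spectral_projections(A, c)

print(np.allclose(P1 + P2, np.eye(3)))          # condition (2)
print(np.allclose(c[0] * P1 + c[1] * P2, A))    # condition (1)
print(np.allclose(P1 @ P2, np.zeros((3, 3))))   # condition (3)
```

Here `P1` works out to the matrix all of whose entries equal $1/3$, that is, the orthogonal projection onto the line spanned by $(1, 1, 1)$.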


Proposition 18.13 Let $V$ be an inner product space finitely generated over $\mathbb{C}$. A normal endomorphism $\alpha$ of $V$ is positive definite if and only if each of its eigenvalues is positive.


Proof If $\alpha$ is positive definite then, by Proposition 17.11, each of its eigenvalues is positive. Conversely, assume each of the eigenvalues of $\alpha$ is positive. By Proposition 18.12, we write $\alpha = \sum_{i=1}^n c_i\alpha_i$, where the $c_i$ are the eigenvalues of $\alpha$ and the $\alpha_i$ are projections in $\operatorname{End}(V)$ satisfying $\alpha_i\alpha_j = \sigma_0$ for $i \neq j$. If $0_V \neq v \in V$ then $\langle \alpha(v), v \rangle = \sum_{i=1}^n \sum_{j=1}^n c_i\langle \alpha_i(v), \alpha_j(v) \rangle = \sum_{i=1}^n c_i\|\alpha_i(v)\|^2 > 0$ and so $\alpha$ is positive definite.


Example Let $V = \mathbb{R}^3$. For each $a \in \mathbb{R}$, let $\alpha_a \in \operatorname{End}(V)$ be the normal endomorphism of $V$ represented with respect to the canonical basis by the matrix $\begin{bmatrix} 1 & a & a \\ a & 1 & a \\ a & a & 1 \end{bmatrix}$. Then $\operatorname{spec}(\alpha_a) = \{2a + 1, 1 - a\}$ and so, by Proposition 18.13, $\alpha_a$ is positive definite precisely when $-1 < 2a < 2$.


As a consequence of Proposition 18.13 and the comments before it, we see that if $\alpha$ is a positive-definite endomorphism of a finitely-generated inner product space $V$ over $\mathbb{C}$ then there exists an endomorphism $\sqrt{\alpha}$ of $V$ satisfying $(\sqrt{\alpha})^2 = \alpha$. This endomorphism is defined by $\sqrt{\alpha} = \sum_{i=1}^n (\sqrt{c_i})\alpha_i$, where the $c_i$ are the eigenvalues of $\alpha$, and where the $\alpha_i$ are defined as in Proposition 18.12. In particular, if $\beta$ is an automorphism of $V$ then, by Proposition 17.10, we can talk about $\sqrt{\beta^*\beta}$, which is also positive definite by Proposition 18.13.


Proposition 18.14 Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha \in \operatorname{Aut}(V)$. Then there exists a unique positive-definite automorphism $\theta$ of $V$ and a unique unitary automorphism $\psi$ of $V$ satisfying $\alpha = \psi\theta$.


Proof By Proposition 17.10, we know that the automorphism $\alpha^*\alpha$ of $V$ is positive definite and so we can set $\theta = \sqrt{\alpha^*\alpha}$. Let $\varphi = \theta\alpha^{-1}$. Then $\varphi^* = (\alpha^{-1})^*\theta^* = (\alpha^*)^{-1}\theta$ so $\varphi^*\varphi = (\alpha^*)^{-1}\theta\theta\alpha^{-1} = (\alpha^*)^{-1}\alpha^*\alpha\alpha^{-1} = \sigma_1$, proving that $\varphi$ is unitary by Proposition 18.1, and hence belongs to $\operatorname{Aut}(V)$. If we now define $\psi = \varphi^{-1}$, we see that $\alpha = \psi\theta$. Moreover, we note that $\theta \in \operatorname{Aut}(V)$ since $\theta = \varphi\alpha$. To prove uniqueness, assume that $\psi\theta = \psi'\theta'$, where $\psi$ and $\psi'$ are unitary automorphisms of $V$ and where $\theta$ and $\theta'$ are positive-definite automorphisms of $V$. Then $\theta^2 = \theta^*\psi^*\psi\theta = (\theta')^*(\psi')^*\psi'\theta' = (\theta')^2$. Since $\theta$ and $\theta'$ are positive definite, this implies that $\theta = \theta'$ and so, since $\theta$ is an automorphism, we have $\psi = \psi\theta\theta^{-1} = \psi'\theta'\theta^{-1} = \psi'$.


The representation of an automorphism $\alpha$ of an inner product space finitely generated over $\mathbb{C}$ in the form given in Proposition 18.14 is sometimes called the polar decomposition of $\alpha$.¹ If we move over to matrices, we see that the polar decomposition of a nonsingular matrix $A \in M_{n\times n}(\mathbb{C})$ is of the form $A = UM$, where $U$ is a unitary matrix and $M$ is a positive-definite Hermitian matrix. Similarly, there exists a unitary matrix $U'$ and a Hermitian matrix $M'$ satisfying $A^H = U'M'$ and so $A = M'(U')^H$, where $(U')^H$ is again unitary. In the case we are working over $\mathbb{R}$, the matrix $U$ is orthogonal, and $M$ is symmetric and positive definite. Because polar decompositions are important in applications, several iterative algorithms exist to compute them.
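
For small matrices the decomposition $A = UM$ can be computed by following the proof of Proposition 18.14 literally, with $M = \sqrt{A^HA}$ obtained from an eigendecomposition and $U = AM^{-1}$. The sketch below assumes NumPy and a nonsingular input; the function name `polar` is hypothetical, and this direct approach is not one of the iterative algorithms alluded to above.

```python
import numpy as np

def polar(A):
    """Return (U, M) with A = U M, U unitary, M positive-definite Hermitian.

    Assumes A is square and nonsingular.
    """
    G = A.conj().T @ A                       # A^H A is positive definite
    evals, V = np.linalg.eigh(G)             # G = V diag(evals) V^H
    M = (V * np.sqrt(evals)) @ V.conj().T    # the positive-definite square root of G
    U = A @ np.linalg.inv(M)
    return U, M

# A real example of the form [[a, -b], [b, a]] with a = 3, b = 4.
A = np.array([[3.0, -4.0],
              [4.0,  3.0]])
U, M = polar(A)

# A complex example.
B = np.array([[1, 2j],
              [3, 4]], dtype=complex)
UB, MB = polar(B)

print(np.allclose(U @ M, A), np.allclose(UB @ MB, B))
print(np.allclose(UB.conj().T @ UB, np.eye(2)))
```

For the first matrix the factor `M` is $5I$ and `U` is the rotation matrix $A/5$.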


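Example If $a$ and $b$ are nonzero real numbers, then the polar decomposition of the matrix $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ is $\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} r & 0 \\ 0 & r \end{bmatrix}$, where $\theta = \arctan\left(\frac{b}{a}\right)$ (to which $\pi$ must be added when $a < 0$) and $r = \sqrt{a^2 + b^2}$.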


Proposition 18.15 (Singular Value Decomposition Theorem) Let $V$ and $W$ be inner product spaces of finite dimensions $k$ and $n$, respectively, and let $\alpha \in \operatorname{Hom}(V, W)$. Then there exists an integer $t \leq \min\{k, n\}$, together with positive real numbers $c_1 \geq c_2 \geq \cdots \geq c_t$ and with orthonormal bases $\{v_1, \dots, v_k\}$ of $V$ and $\{w_1, \dots, w_n\}$ of $W$ satisfying

$$\alpha(v_i) = \begin{cases} c_iw_i & \text{if } 1 \leq i \leq t, \\ 0_W & \text{otherwise} \end{cases}$$

and

$$\alpha^*(w_i) = \begin{cases} c_iv_i & \text{if } 1 \leq i \leq t, \\ 0_V & \text{otherwise.} \end{cases}$$


Proof If $\alpha$ is the 0-map, then the result is immediate, so assume that is not the case. We note that $\beta = \alpha^*\alpha$ is a selfadjoint endomorphism of $V$ and so, by Proposition 17.7, it is orthogonally diagonalizable. Hence $V$ has an orthonormal basis $\{v_1, \dots, v_k\}$ composed of eigenvectors of $\beta$, where each $v_i$ is associated with an eigenvalue $b_i$. By Proposition 17.10, we know that each $b_i$ belongs to $\mathbb{R}$. Moreover, for each $i$ we note that $b_i = b_i\langle v_i, v_i \rangle = \langle b_iv_i, v_i \rangle = \langle \beta(v_i), v_i \rangle = \langle \alpha(v_i), \alpha(v_i) \rangle \geq 0$. Indeed, renumbering if necessary, we can assume that there exists an integer $t \leq k$ such that $b_1 \geq b_2 \geq \cdots \geq b_t > 0$ while $b_{t+1} = \cdots = b_k = 0$. For each $1 \leq i \leq t$, set $c_i = \sqrt{b_i}$ and let $w_i = c_i^{-1}\alpha(v_i) \in W$. If $i \neq j$ then $\langle w_i, w_j \rangle = (c_ic_j)^{-1}\langle \alpha(v_i), \alpha(v_j) \rangle = (c_ic_j)^{-1}\langle \beta(v_i), v_j \rangle = (c_ic_j)^{-1}b_i\langle v_i, v_j \rangle = 0$ while, for each $1 \leq i \leq t$, we have $\langle w_i, w_i \rangle = c_i^{-2}\langle \alpha(v_i), \alpha(v_i) \rangle = c_i^{-2}\langle \beta(v_i), v_i \rangle = c_i^{-2}\langle b_iv_i, v_i \rangle = \langle v_i, v_i \rangle = 1$. Thus we see that the set $\{w_1, \dots, w_t\}$ is orthonormal. Moreover, for each $1 \leq i \leq t$ we have $\|\alpha(v_i)\|^2 = b_i$ so $\|\alpha(v_i)\| = c_i$ and $\alpha^*(w_i) = \alpha^*(c_i^{-1}\alpha(v_i)) = c_i^{-1}\alpha^*\alpha(v_i) = c_i^{-1}\beta(v_i) = c_i^{-1}b_iv_i = c_iv_i$. For $t + 1 \leq i \leq k$ we have $\alpha^*\alpha(v_i) = \beta(v_i) = 0_V$ and so $0 = \langle \beta(v_i), v_i \rangle = \langle \alpha(v_i), \alpha(v_i) \rangle$, which implies that $\alpha(v_i) = 0_W$. Thus $v_i \in \ker(\alpha)$ for each $t + 1 \leq i \leq k$.

We are therefore left with the matter of defining $w_{t+1}, \dots, w_n$ in the case $t < n$. By Proposition 16.18, we know that $\ker(\alpha^*) = \operatorname{im}(\alpha)^\perp$ and so, if we pick an orthonormal basis $\{w_{t+1}, \dots, w_n\}$ for $\ker(\alpha^*)$ we see that $\{w_1, \dots, w_n\}$ is an orthonormal basis for $W$ having the desired properties.


¹ Polar decompositions were first studied by the French engineer Léon Autonne at the beginning of the twentieth century.


The first version of the Singular Value Decom- 
position Theorem was proven by the nineteenth- 
century Italian mathematician Eugenio Beltrami; 
it was subsequently extended by many others, in- 
cluding Camille Jordan and Sylvester. Schmidt 
extended this theorem to infinite-dimensional 
spaces. Effective algorithms for computation of 
singular value decompositions were developed by 
the twentieth-century American computer scientist Gene H. Golub, along with William 
Kahan. 


The scalars $c_1 \geq c_2 \geq \cdots \geq c_t$ given in Proposition 18.15 are called the singular values of the linear transformation $\alpha$. The number $c_1/c_t$, called the spectral condition number, is used as a measure of the numerical instability of the matrix representing $\alpha^*\alpha \in \operatorname{End}(V)$ with respect to the given basis.

If we consider the special case of a linear transformation $\alpha: \mathbb{C}^k \to \mathbb{C}^n$ represented with respect to the canonical bases by a matrix $A \in M_{n\times k}(\mathbb{C})$, the Singular Value Decomposition Theorem says that there exist unitary matrices $P \in M_{n\times n}(\mathbb{C})$ and $Q \in M_{k\times k}(\mathbb{C})$ such that $A$ can be written as $P\begin{bmatrix} D & O \\ O & O \end{bmatrix}Q^H$, where $D \in M_{t\times t}(\mathbb{R})$ is a diagonal matrix having the singular values of $\alpha$ on the diagonal. These singular values are precisely the square roots of the nonzero eigenvalues of $A^HA$. The columns of $Q$ form an orthonormal basis for $\mathbb{C}^k$ consisting of eigenvectors of $A^HA$, and the columns of $P$ form an orthonormal basis for $\mathbb{C}^n$.

If $\alpha: \mathbb{R}^k \to \mathbb{R}^n$ then, of course, the matrices $P$ and $Q$ are orthogonal and $A = P\begin{bmatrix} D & O \\ O & O \end{bmatrix}Q^T$.

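Example The matrix $A = \dfrac{1}{10}\begin{bmatrix} 20 & 20 & -20 & 20 \\ 1 & 17 & 1 & -17 \\ 18 & 6 & 18 & -6 \end{bmatrix}$ can be written as a product $P\begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \end{bmatrix}Q$, where

$$P = \frac{1}{5}\begin{bmatrix} 5 & 0 & 0 \\ 0 & 3 & -4 \\ 0 & 4 & 3 \end{bmatrix} \quad\text{and}\quad Q = \frac{1}{2}\begin{bmatrix} 1 & 1 & -1 & 1 \\ 1 & 1 & 1 & -1 \\ 1 & -1 & 1 & 1 \\ 1 & -1 & -1 & -1 \end{bmatrix}$$

are orthogonal and where the singular values of $A$ are 4, 3, 2.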
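
The factorization in this example can be confirmed with a few lines of NumPy. This is a minimal sketch: it multiplies out the three factors and, independently, asks `numpy.linalg.svd` for the singular values of $A$.

```python
import numpy as np

A = np.array([[20, 20, -20,  20],
              [ 1, 17,   1, -17],
              [18,  6,  18,  -6]]) / 10
P = np.array([[5, 0,  0],
              [0, 3, -4],
              [0, 4,  3]]) / 5
Q = np.array([[1,  1, -1,  1],
              [1,  1,  1, -1],
              [1, -1,  1,  1],
              [1, -1, -1, -1]]) / 2
D = np.array([[4.0, 0, 0, 0],
              [0, 3.0, 0, 0],
              [0, 0, 2.0, 0]])

print(np.allclose(P @ D @ Q, A))            # the product reproduces A
print(np.allclose(P.T @ P, np.eye(3)))      # P is orthogonal
print(np.allclose(Q.T @ Q, np.eye(4)))      # Q is orthogonal

singular_values = np.linalg.svd(A, compute_uv=False)
print(np.round(singular_values, 10))
```

Note that `numpy.linalg.svd` returns the singular values in decreasing order, matching the convention $c_1 \geq c_2 \geq \cdots \geq c_t$ used here.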


Singular value decompositions have many applications, and play important roles 
in the mathematics of optimization, data compression, population genetics, and im- 
age processing. They are especially useful since accurate and relatively-efficient 
algorithms for computing these decompositions are readily available in many com- 
mon linear-algebra software packages. In particular, in many applications one needs 
to compute the singular value decomposition of a product of a large number of ma- 
trices (often over 1,000) and there exist algorithms to do that without having to 
multiply out the matrices explicitly. 


Exercises 


Exercise 1108
Let $A \in M_{n\times n}(\mathbb{C})$ be similar to a unitary matrix. Is $A^{-1}$ necessarily similar to $A^H$?


Exercise 1109
Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{C})$ be a nonsingular matrix having a singular value decomposition $A = PDQ^H$, where $P$ and $Q$ are unitary matrices and $D = \begin{bmatrix} c_1 & & O \\ & \ddots & \\ O & & c_n \end{bmatrix}$ is a diagonal matrix with $c_1 \geq \cdots \geq c_n$. If $B$ is a singular matrix, show that $\|A - B\| \geq c_n$, where $\|\cdot\|$ denotes the spectral norm.


Exercise 1110
Let $n > 1$ be an integer and let $V$ be the subspace of $\mathbb{C}[X]$ consisting of all polynomials of degree at most $n$. Let $0 \neq c \in \mathbb{C}$ and let $\alpha$ be the endomorphism of $V$ defined by $\alpha: p(X) \mapsto p(X + c)$. Is it possible to define an inner product on $V$ relative to which $\alpha$ is normal?

Exercise 1111
Let $a, b, c \in \mathbb{C}$. Find the set of all triples $(x, y, z)$ of complex numbers satisfying the condition that the matrix $\begin{bmatrix} a & x & y \\ 0 & b & z \\ 0 & 0 & c \end{bmatrix}$ represents a normal endomorphism of $\mathbb{C}^3$, endowed with the dot product, with respect to the canonical basis.


Exercise 1112
Show that any Givens rotation matrix in $M_{2\times 2}(\mathbb{R})$ can be written as the product of two Jacobi reflection matrices.


Exercise 1113
Let $n$ be a positive integer. A matrix $A \in M_{n\times n}(\mathbb{C})$ is normal if and only if $A^HA = AA^H$. Show that every normal upper-triangular matrix is a diagonal matrix.


Exercise 1114
Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{R})$. Then $A$ is normal if and only if $A^TA = AA^T$. If $A$ is normal, is $e^A$ normal? Is the converse of this statement true?


Exercise 1115
Let $V = \mathbb{R}^2$, together with the dot product. Show that a matrix in $M_{2\times 2}(\mathbb{R})$ is of the form $\Phi_{BB}(\alpha)$ for some normal endomorphism $\alpha$ of $V$ which is not selfadjoint if and only if it is of the form $\begin{bmatrix} a & -b \\ b & a \end{bmatrix}$ for real numbers $a$ and $b \neq 0$.


Exercise 1116
Let $V$ be an inner product space finitely generated over $\mathbb{R}$ and let $S$ be the set of all isometries of $V$. Is $S$ an $\mathbb{R}$-subalgebra of $\operatorname{End}(V)$?


Exercise 1117 

Let n be a positive integer and let V = C” on which we have the dot product. If 
a € End(V), let G(a@) = {(a(v), v) | ||v|] = 1}. For the special case n = 2, find 
G(a) and G(8), where @ is represented with respect to the canonical basis by 


_ {1 0 : : j : 
the matrix and 6 is represented with respect to the canonical basis by 


0 0 
_ 10 2 
the matrix E al : 


Exercise 1118 

Let V = R? on which we have the dot product, and let W be the space of all 
polynomial functions in R® of degree at most 2, on which we define the inner 
product (f, g) = i Ft (x)g(x) dx. Let a e Hom(V, W) be defined by 
a
b 
a b ee 1+ St E+ cx + cx. 
c 


Is this linear transformation an isometry? 


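Exercise 1119
Let $V$ be an inner product space and let $\alpha$ be an endomorphism of $V$ satisfying the condition that $\alpha^*\alpha = \sigma_0$. Show that $\alpha = \sigma_0$.


Exercise 1120
Let $V = \mathbb{R}^3$ with the dot product, and let $\alpha$ be the automorphism of $V$ defined by $\alpha: \begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto \begin{bmatrix} -c \\ -b \\ -a \end{bmatrix}$. Is $\alpha$ unitary?


Exercise 1121
Is the matrix $\begin{bmatrix} 0 & 0 & 0 & i \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ i & 0 & 0 & 0 \end{bmatrix} \in M_{4\times 4}(\mathbb{C})$ unitary?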


Exercise 1122
Find a real number $a$ satisfying the condition that the matrix

$$a\begin{bmatrix} -9 + 8i & -10 - 4i & -16 - 18i \\ -2 - 24i & 1 + 12i & -10 - 4i \\ 4 - 10i & -2 - 24i & -9 + 8i \end{bmatrix} \in M_{3\times 3}(\mathbb{C})$$

is unitary.


Exercise 1123 
Find a real number a satisfying the condition that the matrix 


12 6—12i 12+61 6-6 


1 | 64 12i a 5i 3+i 
| 12—6% Si 2 t92/° ee 
6+ 61 3-i 1+3i —22 
is unitary. 
Exercise 1124
Given a real number $a$, check if the matrix

$$\begin{bmatrix} -\sin^2(a) + i\cos^2(a) & (1 + i)\sin(a)\cos(a) \\ (1 + i)\sin(a)\cos(a) & -\cos^2(a) + i\sin^2(a) \end{bmatrix} \in M_{2\times 2}(\mathbb{C})$$

is unitary.
Exercise 1125
Find all possible triples a, b, c of real numbers, if any exist, such that the matrix 


1 —2 2 
3] 2 —1 2 | is orthogonal. 
a bc 


Exercise 1126 


1 —2a 2a? 
For which a € R is a 2a 1—2a* 2a | €.M3x3(R) an orthogonal 
2g” =~ a 1 
matrix? 
Exercise 1127 
1 1 1 1 
Is th a 2 R) orth 1? 
s the matrix 5 1-1 1-1 € M4x.4(R) orthogonal? 
1 -1 -1l 1 


Exercise 1128 


Ifvu= H € R?, show that there exists an orthogonal matrix A € M2,.2(R) and 


a real number b satisfying the condition that Av = Hi 


Exercise 1129
Let $a$ and $b$ be real numbers, not both equal to 0. Show that the matrix

$$\frac{1}{a^2 + ab + b^2}\begin{bmatrix} ab & a(a+b) & b(a+b) \\ a(a+b) & -b(a+b) & ab \\ b(a+b) & ab & -a(a+b) \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$$

is orthogonal.


Exercise 1130
Find all $a \in \mathbb{R}$ such that the matrix $\begin{bmatrix} 2a & -2a & a \\ -2a & -a & 2a \\ a & 2a & 2a \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$ is orthogonal.


Exercise 1131
Let $A = \begin{bmatrix} 4 & -1 & 1 \\ -1 & 4 & -1 \\ 1 & -1 & 4 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. Find an orthogonal matrix $P$ such that $P^TAP$ is a diagonal matrix.

Exercise 1132
Let $A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \in M_{4\times 4}(\mathbb{R})$. Find an orthogonal matrix $P$ such that $P^TAP$ is a diagonal matrix.


Exercise 1133
Let $n$ be a positive integer and let $A$ and $B$ be orthogonal matrices in $M_{n\times n}(\mathbb{R})$ satisfying $|A| + |B| = 0$. Show that $|A + B| = 0$.


Exercise 1134 
1 -1 272 
Let A= 5 2/2 2/2 0 € M3,3(R). Find an infinite number of pairs 
=f 4, 32 


(P, Q) of orthogonal matrices such that P AQ is a diagonal matrix. 
Exercise 1135 


4 

z 0 

3 
Find an a € R such that the matrix a 0] €M3x3(R) is orthogonal. 
0 1 


ous & 


Exercise 1136
Let $A, B \in M_{k\times n}(\mathbb{R})$ be matrices such that the columns of each form orthonormal bases for the same subspace $W$ of $\mathbb{R}^k$. Show that $AA^T = BB^T$.


Exercise 1137 


Let A,B € My xn(R) be orthogonal matrices. Is the matrix E | € 


Monx2n(R) necessarily orthogonal? 


Exercise 1138
Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{R})$ be a skew-symmetric matrix. Show that $(A - I)^{-1}(A + I)$ is an orthogonal matrix which does not have 1 as an eigenvalue.


Exercise 1139 


Find two distinct functions fi, fo: R* \ — R satisfying the condition 


oooco 


that 
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PLP acta 2(bc — da) 2(bd — ca) 
2(bc — da) a+b? —c*-—d? 2(cd — ba) 
2(bd — ca) 2(cd — ba) a2 +b? —c* —d? 


No ea 


is always an orthogonal matrix. 


Exercise 1140 
Let n be a positive integer and let A, BE My xn(R) satisfy A? + B? =1.Is the 


matrix ae 
B A 


€ Monx2n(R) necessarily orthogonal? 

Exercise 1141
Let $O \neq A \in M_{3\times 3}(\mathbb{C})$ be a matrix satisfying $\operatorname{adj}(A) = A^H$. Show that $A$ is a unitary matrix having determinant 1.


Exercise 1142
Let $n$ be a positive integer and let $\alpha$ be the endomorphism of $\mathbb{C}^n$ defined by $\alpha: v \mapsto iv$. Is $\alpha$ normal?


Exercise 1143
Let $V$ be an inner product space and let $\alpha, \beta \in \operatorname{End}(V)$ be normal. Is $\beta\alpha$ necessarily normal?


Exercise 1144
Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\alpha \in \operatorname{End}(V)$ satisfy the condition that every eigenvector of $\beta = \alpha + \alpha^*$ is also an eigenvector of $\gamma = \alpha - \alpha^*$. Prove that $\alpha$ is normal.


Exercise 1145 
Let w be the endomorphism of C? represented with respect to the canonical basis 


by the matrix A = E a Is a normal? 


Exercise 1146
Let $V$ be an inner product space over $\mathbb{C}$ and let $\alpha \in \operatorname{End}(V)$ be normal. If $c \in \mathbb{C}$, is the endomorphism $\alpha - c\sigma_1$ necessarily normal?


Exercise 1147
Let $V$ be an inner product space finitely generated over $\mathbb{C}$ and let $\sigma_0 \neq \alpha \in \operatorname{End}(V)$ be normal. Show that $\alpha$ is not nilpotent.
Exercise 1148
Let a,b € R let w € End(R?) be represented with respect to the canonical ba- 


a —2 b 
sis by b a —2|. For which values of a and b is this endomorphism 
—2 3 a 


normal? 


Exercise 1149 
Let w € End(R?) be represented with respect to the canonical basis by the matrix 


14 2 14 
5 2 —1 -—16 |. Show that @ is selfadjoint and find an orthonormal basis 
14 —-16 5 


of R? composed of eigenvectors of a. 


Exercise 1150 
Let aw be the endomorphism of C? represented with respect to the canonical basis 


6 —-2 3 
by the matrix 3 6 —2 |. Show that qa is normal and find an orthonormal 
—2 3 6 


basis of C? composed of eigenvectors of a. 


Exercise 1151
Let $V$ be an inner product space and let $\sigma_0 \neq \alpha \in \operatorname{End}(V)$ be a normal projection. Show that $\|\alpha(v)\| \leq \|v\|$ for all $v \in V$, with equality whenever $v \in \operatorname{im}(\alpha)$. Give an example where this does not hold for $\alpha$ which is not normal.


Exercise 1152 
Let $n$ be a positive integer and let $F$ be any field. A matrix $A \in M_{n\times n}(F)$ is
antiorthogonal if and only if $A^{-1} = -A^T$. Give an example of an antiorthogonal
matrix in $M_{2\times 2}(GF(3))$.
Exercise 1153 
a a\ 
Let w € End(R*) be defined bya: nats: ae . Show that @ is normal 
a3 a3 + a4 
a4 a4 — a3 


but not selfadjoint. 


Exercise 1154 
Let $\alpha: \mathbb{R}^2 \to \mathbb{R}^3$ be the linear transformation represented with respect to the
canonical bases by the matrix
$$A = \begin{bmatrix} 1 & 1 \\ 2 & 2 \\ 2 & 2 \end{bmatrix}.$$
Find the singular values of $\alpha$.


Exercise 1155
Let $n$ be a positive integer, let $F$ be a field, and let $I$ be the identity matrix in
$M_{n\times n}(F)$. A matrix $A \in M_{2n\times 2n}(F)$ is symplectic if and only if


If B,C € Mnxn(R), show that B+iC € My xn(C) is unitary if and only if the 


matrix - ae € Monx2n(R) is symplectic. 
Exercise 1156 
0 --1//2 d 
For which c,d € C is the matrix c 1/2 i/2 | Hermitian? For 


ijJ2 -if2 172 


which values of c and d is it unitary? 


Exercise 1157 

A polynomial $p(X) \in \mathbb{C}[X]$ of degree $n > 0$ is a reciprocal polynomial if and
only if $p(X) = \pm X^n p(X^{-1})$. Show that characteristic polynomials of orthogonal
matrices are reciprocal and that the set of all reciprocal polynomials, together
with the 0-polynomial, forms a subalgebra of $\mathbb{C}[X]$.


Exercise 1158 
(Cayley representation) For any real number $t$, with $\cos(t) \neq -1$, find a
skew-symmetric matrix $A \in M_{2\times 2}(\mathbb{R})$ satisfying
$$\begin{bmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{bmatrix} = (I - A)(I + A)^{-1}.$$
Exercise 1159 
Let $V$ be a vector space finitely generated over $\mathbb{C}$ and let $\alpha$ be an automorphism
of $V$ having polar decomposition $\alpha = \psi\theta$, where $\psi$ is unitary and $\theta$ is positive
definite. Show that $\alpha$ is normal if and only if $\alpha = \theta\psi$.


Moore-Penrose Pseudoinverses 19


Let $V$ and $W$ be inner product spaces, and let $\alpha: V \to W$ be a linear transformation.
We know that there exists a linear transformation $\beta: W \to V$ satisfying the
condition that $\beta\alpha$ is the identity function on $V$ and $\alpha\beta$ is the identity function on
$W$ if and only if $\alpha$ is an isomorphism; in this case, $\beta = \alpha^{-1}$. If both spaces are
finitely generated, we also know that such an isomorphism can exist only when
$\dim(V) = \dim(W)$. If $\alpha$ is not an isomorphism, it is possible to weaken the notion
of the inverse of a function. Given a linear transformation $\alpha: V \to W$, we say that
a linear transformation $\beta: W \to V$ is a Moore-Penrose pseudoinverse of $\alpha$ if and
only if the following conditions are satisfied:

(1) $\alpha\beta\alpha = \alpha$ and $\beta\alpha\beta = \beta$;

(2) The endomorphisms $\beta\alpha \in \operatorname{End}(V)$ and $\alpha\beta \in \operatorname{End}(W)$ are selfadjoint.


With kind permission of the Archives of the Mathematisches Forschungsinstitut Ober- 
wolfach (Penrose). 

Eliakim Hastings Moore developed this construction in 1922, but it did 
not receive much attention at the time; it was rediscovered independently 
in 1955 by Sir Roger Penrose, a contemporary British applied mathe- 
matician, best known for his collaboration with the physicist Stephen 
Hawking. 


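Example The two parts of condition (2) in the definition of the Moore-Penrose
pseudoinverse are independent. To see this, consider the linear transformation
$\alpha: \mathbb{R}^3 \to \mathbb{R}^2$ defined by
$$\alpha: v \mapsto \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 0 \end{bmatrix} v.$$
For any $c, d \in \mathbb{R}$, let $\beta: \mathbb{R}^2 \to \mathbb{R}^3$ be the linear transformation defined by
$$\beta: w \mapsto \begin{bmatrix} 1-3c & -2-3d \\ 0 & 1 \\ c & d \end{bmatrix} w.$$
Then one can check that $\alpha\beta\alpha = \alpha$ and $\beta\alpha\beta = \beta$ and that $\alpha\beta = \sigma_1$ in $\operatorname{End}(\mathbb{R}^2)$.
On the other hand,
$$\beta\alpha: v \mapsto \begin{bmatrix} 1-3c & -6c-3d & 3-9c \\ 0 & 1 & 0 \\ c & 2c+d & 3c \end{bmatrix} v,$$
and it is easy enough to choose $c$ and $d$ so that this matrix is not symmetric and
hence $\beta\alpha$ is not selfadjoint.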
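
The two conditions in the definition are easy to test mechanically for matrices. The following is a minimal sketch, not part of the original text, which checks them for the example above. It assumes that $\alpha$ is represented by the matrix with rows $(1, 2, 3)$ and $(0, 1, 0)$, which is the matrix consistent with the displayed product $\beta\alpha$; the helper names `matmul`, `transpose`, and `penrose_report` are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

def penrose_report(A, B):
    """Report which defining conditions hold for a candidate pseudoinverse B of A."""
    AB, BA = matmul(A, B), matmul(B, A)
    return {
        "ABA == A": matmul(AB, A) == A,
        "BAB == B": matmul(BA, B) == B,
        "AB symmetric": AB == transpose(AB),
        "BA symmetric": BA == transpose(BA),
    }

def candidate(c, d):
    """The matrix of beta from the example, for given parameters c and d."""
    return [[1 - 3 * c, -2 - 3 * d], [0, 1], [c, d]]

# Assumed matrix of alpha (see the lead-in).
A = [[1, 2, 3], [0, 1, 0]]

# A choice of parameters for which only the symmetry of BA fails.
report_bad = penrose_report(A, candidate(Fraction(1), Fraction(0)))

# A choice of parameters for which all four conditions hold.
report_good = penrose_report(A, candidate(Fraction(3, 10), Fraction(-3, 5)))

print(report_bad)
print(report_good)
```

For $c = 1$, $d = 0$ only the symmetry of $\beta\alpha$ fails, whereas for $c = 3/10$, $d = -3/5$ all four tests succeed, so that choice yields the pseudoinverse of this particular matrix.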


We will denote the Moore-Penrose pseudoinverse of $\alpha$ by $\alpha^+$. Of course, in order
to justify this notation we have to show that $\alpha^+$ exists and is unique, which we will
do for the case that $V$ and $W$ are finitely generated. We will begin with uniqueness.


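Proposition 19.1 Let $V$ and $W$ be inner product spaces and let $\alpha: V \to W$
be a linear transformation. If $\alpha$ has a Moore-Penrose pseudoinverse, it must
be unique.


Proof Suppose that $\beta, \gamma \in \operatorname{Hom}(W, V)$ are Moore-Penrose pseudoinverses of $\alpha$.
Then
\begin{align*}
\beta &= \beta\alpha\beta = (\beta\alpha)^*\beta = \alpha^*\beta^*\beta = (\alpha\gamma\alpha)^*\beta^*\beta = (\gamma\alpha)^*\alpha^*\beta^*\beta = \gamma\alpha\alpha^*\beta^*\beta \\
&= \gamma\alpha(\beta\alpha)^*\beta = \gamma\alpha\beta\alpha\beta = \gamma\alpha\beta = \gamma\alpha\gamma\alpha\beta = \gamma(\alpha\gamma)^*\alpha\beta = \gamma\gamma^*\alpha^*\alpha\beta \\
&= \gamma\gamma^*\alpha^*(\alpha\beta)^* = \gamma\gamma^*(\alpha\beta\alpha)^* = \gamma\gamma^*\alpha^* = \gamma(\alpha\gamma)^* = \gamma\alpha\gamma = \gamma
\end{align*}
and so we have proven uniqueness.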


In particular, if $\alpha: V \to W$ is an isomorphism, then, by Proposition 19.1, we
have $\alpha^+ = \alpha^{-1}$. If $\alpha$ is the 0-function then so is $\alpha^+$.


Proposition 19.2 Let $V$ and $W$ be finitely-generated inner product spaces
and let $\alpha: V \to W$ be a linear transformation.

(1) If $\alpha$ is a monomorphism, then $\alpha^+$ exists and equals $(\alpha^*\alpha)^{-1}\alpha^*$. Moreover,
$\alpha^+\alpha$ is the identity function on $V$;

(2) If $\alpha$ is an epimorphism, then $\alpha^+$ exists and equals $\alpha^*(\alpha\alpha^*)^{-1}$. Moreover,
$\alpha\alpha^+$ is the identity function on $W$.


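Proof (1) From Proposition 16.20, we see that if $\alpha$ is a monomorphism then
$\alpha^*\alpha \in \operatorname{Aut}(V)$, and so $(\alpha^*\alpha)^{-1}$ exists. Set $\beta = (\alpha^*\alpha)^{-1}\alpha^*$. Then $\beta\alpha$ is the
identity function on $V$, and so $\beta\alpha$ is a selfadjoint endomorphism of $V$ which
satisfies $\alpha\beta\alpha = \alpha$ and $\beta\alpha\beta = \beta$. Finally,
$$(\alpha\beta)^* = [\alpha(\alpha^*\alpha)^{-1}\alpha^*]^* = \alpha[(\alpha^*\alpha)^{-1}]^*\alpha^* = \alpha[(\alpha^*\alpha)^*]^{-1}\alpha^* = \alpha(\alpha^*\alpha)^{-1}\alpha^* = \alpha\beta,$$
and so $\alpha\beta$ is also selfadjoint. Thus $\beta = \alpha^+$.
(2) From Proposition 16.20, we see that if $\alpha$ is an epimorphism then
$\alpha\alpha^* \in \operatorname{Aut}(W)$ and so $(\alpha\alpha^*)^{-1}$ exists. As in (1), we see that $\alpha^*(\alpha\alpha^*)^{-1} = \alpha^+$.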


Example Let $\alpha: \mathbb{R}^2 \to \mathbb{R}^3$ be the linear transformation represented with respect to
the canonical bases by the matrix
$$\begin{bmatrix} 1 & 2 \\ -1 & 3 \\ 2 & 4 \end{bmatrix}.$$
This is a monomorphism and so, by Proposition 19.2, $\alpha^+$ exists and is represented
with respect to the canonical bases by the matrix
$$\frac{1}{25}\begin{bmatrix} 3 & -10 & 6 \\ 1 & 5 & 2 \end{bmatrix}.$$
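
The computation in this example can be reproduced directly from the formula $(A^TA)^{-1}A^T$ of Proposition 19.2(1). The following is a small sketch in exact rational arithmetic, restricted (as an assumption made for brevity) to real matrices having exactly two linearly independent columns, so that $A^TA$ is an invertible $2 \times 2$ matrix. The function names are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

def inverse_2x2(M):
    """Inverse of a nonsingular 2 x 2 matrix, computed exactly."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def left_pseudoinverse(A):
    """Return (A^T A)^{-1} A^T for a matrix with two linearly independent columns."""
    At = transpose(A)
    return matmul(inverse_2x2(matmul(At, A)), At)

A = [[1, 2], [-1, 3], [2, 4]]
A_plus = left_pseudoinverse(A)
print(A_plus)
```

Multiplying `A_plus` by `A` gives the $2 \times 2$ identity matrix, in agreement with the statement that $\alpha^+\alpha$ is the identity function on $V$.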


Proposition 19.3 Let $V$ and $W$ be finitely-generated inner product spaces.
Then every $\alpha \in \operatorname{Hom}(V, W)$ has a Moore-Penrose pseudoinverse $\alpha^+ \in
\operatorname{Hom}(W, V)$.


Proof Let $Y = \operatorname{im}(\alpha)$, and write $\alpha = \mu\beta$, where $\beta: V \to Y$ is an epimorphism
given by $\beta: v \mapsto \alpha(v)$, and $\mu: Y \to W$ is the inclusion monomorphism. By
Proposition 19.2, we know that $\beta^+ \in \operatorname{Hom}(Y, V)$ and $\mu^+ \in \operatorname{Hom}(W, Y)$ exist and satisfy
the conditions that $\beta\beta^+$ and $\mu^+\mu$ are equal to the identity function on $Y$. Therefore,
we see that $(\mu\beta)(\beta^+\mu^+)(\mu\beta) = \mu\beta$ and $(\beta^+\mu^+)(\mu\beta)(\beta^+\mu^+) = \beta^+\mu^+$, and we see
that $(\beta^+\mu^+)(\mu\beta) = \beta^+\beta$ and $(\mu\beta)(\beta^+\mu^+) = \mu\mu^+$ are selfadjoint. Thus $\alpha^+$ exists
and equals $\beta^+\mu^+$.


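As an immediate consequence of this, we note that if $\alpha$ is an endomorphism
of a finitely-generated inner product space $V$ then, by Proposition 6.11, we see
that $\operatorname{rk}(\alpha\alpha^+) \le \operatorname{rk}(\alpha)$ and $\operatorname{rk}(\alpha) = \operatorname{rk}(\alpha\alpha^+\alpha) \le \operatorname{rk}(\alpha\alpha^+)$ and so $\operatorname{rk}(\alpha\alpha^+) = \operatorname{rk}(\alpha)$.
Similarly, $\operatorname{rk}(\alpha^+\alpha) = \operatorname{rk}(\alpha)$.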

If $F$ is either $\mathbb{R}$ or $\mathbb{C}$, and if we are given a linear transformation $\alpha: F^k \to F^n$
which is represented with respect to the canonical bases by the matrix $A = [a_{ij}]$,
then we will denote the matrix representing $\alpha^+$ with respect to these bases by $A^+$.
Thus the matrix $A^+$ has the following properties:

(1) $AA^+A = A$ and $A^+AA^+ = A^+$;
(2) The matrices $AA^+$ and $A^+A$ are symmetric (in the case $F = \mathbb{R}$) or Hermitian
(in the case $F = \mathbb{C}$).


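Example Let
$$A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 1 & -2 \\ 3 & 0 & 0 \end{bmatrix} \in M_{3\times 3}(\mathbb{R}).$$
This matrix is clearly singular and hence $A^{-1}$ does not exist. However, we can check that
$$A^+ = \frac{1}{45}\begin{bmatrix} 5 & 5 & 10 \\ -5 & 4 & -1 \\ 10 & -8 & 2 \end{bmatrix}.$$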


For nonsingular square matrices $A$ and $B$ of the same size, we know that
$(AB)^{-1} = B^{-1}A^{-1}$. A similar equality does not hold for the Moore-Penrose
pseudoinverse, as the following example shows.
6


2 
Example If A = 7 44 


and B = E | in M2x2(R), then AB = ke i 


1 2 2 1 
1 1 7 . 
Then At = 6 | and Bt = [> | so BtAt = a| 4 2}: whit 


The Singular Value Decomposition Theorem can be used to compute pseudoin- 
verses. This is important since, as we have remarked previously, there exist several 
relatively efficient and stable numerical algorithms for computing such decomposi- 
tions. 


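Example Let $\alpha: \mathbb{R}^k \to \mathbb{R}^n$ be a monomorphism represented with respect to the
canonical bases by a matrix $A$. By Proposition 19.2, we have $A^+ = (A^TA)^{-1}A^T$.
By Proposition 18.15, we can write $A = PEQ^T$, where $P \in M_{n\times n}(\mathbb{R})$ and
$Q \in M_{k\times k}(\mathbb{R})$ are orthogonal matrices and $E \in M_{n\times k}(\mathbb{R})$ is of the form
$\begin{bmatrix} D \\ O \end{bmatrix}$ for a diagonal matrix $D \in M_{k\times k}(\mathbb{R})$, the diagonal entries of which are nonzero.
Then
\begin{align*}
A^+ &= (A^TA)^{-1}A^T = (QE^TP^TPEQ^T)^{-1}QE^TP^T \\
&= (QE^TEQ^T)^{-1}QE^TP^T = Q(D^2)^{-1}Q^TQE^TP^T \\
&= Q\begin{bmatrix} D^{-1} & O \end{bmatrix}P^T.
\end{align*}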


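Example If $A(t) = \begin{bmatrix} 1 & 0 \\ 0 & t \end{bmatrix}$ for all real numbers $t$ then we see that
$A(t)^+ = \begin{bmatrix} 1 & 0 \\ 0 & t^{-1} \end{bmatrix}$ when $t \neq 0$, but is equal to $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ for $t = 0$. Thus we see that not only
is $\lim_{t\to 0} A(t)^+$ not equal to $A(0)^+$, but indeed that the value of $A(t)^+$ moves farther
and farther away from $A(0)^+$ as $t$ approaches 0.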
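
The discontinuity in this example is easy to observe numerically. The sketch below is an illustration only: it uses the fact, visible in the example, that the pseudoinverse of a diagonal matrix is obtained by inverting the nonzero diagonal entries and leaving the zero entries alone. The function name `pinv_diagonal` is hypothetical, and a diagonal matrix is represented simply by the list of its diagonal entries.

```python
def pinv_diagonal(entries):
    """Diagonal of the pseudoinverse of a diagonal matrix with the given diagonal."""
    return [1.0 / x if x != 0 else 0.0 for x in entries]

# As t shrinks, the second diagonal entry of the pseudoinverse of diag(1, t)
# grows without bound, yet at t = 0 it drops to 0.
for t in (1.0, 1e-3, 1e-6, 0.0):
    print(t, pinv_diagonal([1.0, t]))
```

In floating-point work this means that an entry which is "really" zero but has been perturbed by rounding to something like $10^{-16}$ produces an enormous entry in the computed pseudoinverse, which is why practical routines treat sufficiently small values as zero.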


Thus we see that the Moore-Penrose pseudoinverse is not computationally stable.
This means that one has to be very careful in actual applications. Because of
the importance and utility of Moore-Penrose pseudoinverses, there exists a considerable
literature on techniques for computing $A^+$ or $A^+A$, given a matrix $A$. One of
the methods used in practice for computing the Moore-Penrose pseudoinverse over
$\mathbb{R}$ is a recursive one, known as Greville's method, which is based on the following
result: If $A \in M_{k\times n}(\mathbb{R})$, and if we write $A = [B\ \ v]$, where $B \in M_{k\times(n-1)}(\mathbb{R})$, then
$$A^+ = \begin{bmatrix} B^+(I - vw^T) \\ w^T \end{bmatrix},$$
where
$$w = \begin{cases} \|(I - BB^+)v\|^{-2}(I - BB^+)v & \text{if } \|(I - BB^+)v\| \neq 0, \\ (1 + \|B^+v\|^2)^{-1}(B^+)^TB^+v & \text{otherwise.} \end{cases}$$
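
The recursion can be turned into a short program. The following is a sketch rather than a production implementation: it works in exact rational arithmetic, so that the test $\|(I - BB^+)v\| \neq 0$ is exact, and it starts from the pseudoinverse of the first column alone, which is taken to be $a_1^T/\|a_1\|^2$ for a nonzero column $a_1$ and the zero row otherwise. The function name `greville_pinv` is hypothetical, and $B^+(I - vw^T)$ is computed in the equivalent form $B^+ - (B^+v)w^T$.

```python
from fractions import Fraction

def greville_pinv(A):
    """Pseudoinverse of a rational matrix (list of rows), built one column at a time."""
    k, n = len(A), len(A[0])
    cols = [[Fraction(A[i][j]) for i in range(k)] for j in range(n)]

    # Pseudoinverse of the first column, viewed as a k x 1 matrix.
    first = cols[0]
    norm2 = sum(x * x for x in first)
    P = [[x / norm2 if norm2 else Fraction(0) for x in first]]

    for j in range(1, n):
        v = cols[j]
        # d = B^+ v, where B consists of the first j columns and P = B^+.
        d = [sum(p * x for p, x in zip(row, v)) for row in P]
        # c = (I - B B^+) v = v - B d.
        c = [v[i] - sum(cols[m][i] * d[m] for m in range(j)) for i in range(k)]
        c2 = sum(x * x for x in c)
        if c2 != 0:
            w = [x / c2 for x in c]
        else:
            scale = 1 + sum(x * x for x in d)
            # w = (1 + ||B^+ v||^2)^{-1} (B^+)^T B^+ v.
            w = [sum(P[m][i] * d[m] for m in range(j)) / scale for i in range(k)]
        # New pseudoinverse: B^+ - d w^T on top, w^T as the last row.
        P = [[P[m][i] - d[m] * w[i] for i in range(k)] for m in range(j)] + [w]
    return P

A = [[1, -1, 2], [2, 1, -2], [3, 0, 0]]
A_plus = greville_pinv(A)
print(A_plus)
```

Applied to the singular $3 \times 3$ matrix with rows $(1, -1, 2)$, $(2, 1, -2)$, $(3, 0, 0)$, the third column is a multiple of the second, so the "otherwise" branch is exercised in the last step.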


© Mrs Greville (Greville); © Adi Ben-Israel (Ben-Israel) 


In the 1970s, the American mathematician 
Thomas N.E. Greville and the American/Israeli 
mathematician Adi Ben-Israel popularized and 
reinvigorated the use of the Moore—Penrose pseu- 
doinverse as a computational tool. 


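Another technique is to break $A$ up into blocks, if possible. Indeed, if
$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$, where $A_{11}$ is a nonsingular square matrix the rank of which equals
the rank of $A$, then, by Zlobec's formula, we have
$$A^+ = \begin{bmatrix} A_{11} & A_{12} \end{bmatrix}^T B^T \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix}^T, \quad \text{where } B = \left(\begin{bmatrix} A_{11} & A_{12} \end{bmatrix} A^T \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix}\right)^{-1}.$$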


One can also use convergence methods to compute the Moore-Penrose pseudoinverse
of a matrix. If $A \in M_{k\times n}(\mathbb{C})$ then, by Proposition 17.4, we know that
the eigenvalues of $A^HA$ are real. Let $c$ be the largest such eigenvalue and pick a
real number $b$ satisfying $0 < bc < 2$. For each integer $p \ge 2$, define the sequence
$Y_0, Y_1, \ldots$ of matrices in $M_{n\times k}(\mathbb{C})$ as follows:

(1) $Y_0 = bA^H$;
(2) If $j \ge 0$ and $Y_j$ has already been defined, set $T_j = I - Y_jA$ and set
$Y_{j+1} = Y_j + \left(\sum_{i=1}^{p-1} T_j^i\right) Y_j$. Then the sequence $Y_0, Y_1, \ldots$ converges to $A^+$.
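
For $p = 2$ the update reduces to $Y \leftarrow Y + (I - YA)Y = 2Y - YAY$. The following floating-point sketch for real matrices illustrates this special case only. As an assumption made to avoid computing eigenvalues, it takes $b$ to be the reciprocal of the trace of $A^TA$; since the eigenvalues of $A^TA$ are nonnegative and sum to the trace, this guarantees $0 < bc \le 1 < 2$ for a nonzero matrix $A$. The function name and the fixed number of steps are hypothetical choices.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

def iterate_pinv(A, steps=30):
    """Approximate the pseudoinverse of a nonzero real matrix by Y <- 2Y - Y A Y."""
    # trace(A^T A) is the sum of the squares of all entries of A.
    trace = sum(x * x for row in A for x in row)
    b = 1.0 / trace
    Y = [[b * x for x in row] for row in transpose(A)]  # Y_0 = b A^T
    for _ in range(steps):
        YAY = matmul(matmul(Y, A), Y)
        Y = [[2 * y - z for y, z in zip(ry, rz)] for ry, rz in zip(Y, YAY)]
    return Y

A = [[1.0, 2.0], [-1.0, 3.0], [2.0, 4.0]]
approx = iterate_pinv(A)
print(approx)
```

For this matrix the iterates approach the matrix with rows $(0.12, -0.4, 0.24)$ and $(0.04, 0.2, 0.08)$, which is what the closed formula $(A^TA)^{-1}A^T$ gives.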

Another method is the following: if $A \in M_{n\times n}(\mathbb{R})$ is an arbitrary symmetric
matrix we can define matrices $A_0, A_1, \ldots$ by setting $A_0 = A$ and
$A_{k+1} = [I + (I - A_k)(I + A_k)^{-1}]A_k$ for all $k \ge 0$. Also, we can define real numbers $c_0, c_1, \ldots$ by
setting $c_0 = 1$ and $c_{i+1} = 2c_i + 1$ for each $i \ge 0$. Then the Kovarik algorithm states
that if none of the numbers $-c_i^{-1}$ is an eigenvalue of $A$, the sequence $A_0, A_1, \ldots$
converges to $A^+A$.

Let $F$ be either $\mathbb{R}$ or $\mathbb{C}$, let $k$ and $n$ be positive integers, and let $A \in M_{k\times n}(F)$.
We now look at what the matrix $A^+$ says about a solution (if any) to a system of
linear equations of the form $AX = w$, where $w \in F^k$. First of all, we note that in
general the following proposition holds.


Proposition 19.4 Let $F$ be either $\mathbb{R}$ or $\mathbb{C}$, let $k$ and $n$ be positive integers, let
$A \in M_{k\times n}(F)$, and let $w \in F^k$. The system of linear equations $AX = w$ has
a solution if and only if $(AA^+)w = w$.


Proof If there is a vector $v \in F^n$ satisfying $Av = w$ then $(AA^+)w = (AA^+)(Av) =
(AA^+A)v = Av = w$. Conversely, if $(AA^+)w = w$ then $A(A^+w) = w$ and so
$Av = w$, where $v = A^+w$.


We also note that, in the situation above, if $y \in F^n$ then $A(I - A^+A)y = 0$
and so we also see that $A^+w + (I - A^+A)y$ is also a solution to the system $AX = w$,
assuming that the system has any solutions at all. Conversely, any solution to this
system is of the form $A^+w + u$, where $Au = 0$, and so $(I - A^+A)u = u$.


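Proposition 19.5 Let $F$ be either $\mathbb{R}$ or $\mathbb{C}$, let $k$ and $n$ be positive integers,
and let $A \in M_{k\times n}(F)$. Let $w \in F^k$. If the system $AX = w$ has a solution then
in the set of all solutions to this system of linear equations there is precisely
one having a minimal norm, and it is $A^+w$.


Proof If $u$ is a solution to this system, then we have already seen that it is of the
form $A^+w + (I - A^+A)y$. But we note that
\begin{align*}
\langle A^+w, (I - A^+A)y\rangle &= \langle A^+AA^+w, (I - A^+A)y\rangle \\
&= \langle A^+w, (A^+A)(I - A^+A)y\rangle \\
&= \langle A^+w, (A^+A - A^+AA^+A)y\rangle \\
&= 0
\end{align*}
so
\begin{align*}
\|u\|^2 = \langle u, u\rangle &= \langle A^+w + (I - A^+A)y, A^+w + (I - A^+A)y\rangle \\
&= \langle A^+w, A^+w\rangle + \langle (I - A^+A)y, (I - A^+A)y\rangle \\
&= \|A^+w\|^2 + \|(I - A^+A)y\|^2,
\end{align*}
which implies that $\|u\| \ge \|A^+w\|$, with equality only when $(I - A^+A)y = 0$, that is
to say only when $u = A^+w$.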
Example Let A = ol Tl andl | then Ae |e | aus
1 1 1 2 14 
1 4 
the solution to the system $AX = w$ having minimal norm is $A^+w = \frac{1}{14}\begin{bmatrix} 8 \\ 11 \\ 9 \end{bmatrix}$. Its
norm is $\frac{1}{14}\sqrt{266}$.


But what happens if the system $AX = w$ does not have a solution? Suppose that
$F$ is either $\mathbb{R}$ or $\mathbb{C}$ and that $A \in M_{k\times n}(F)$ and $w \in F^k$, where $k$ and $n$ are positive
integers. Then the system $(A^+A)X = A^+w$ always has a solution, namely $A^+w$,
and, by Proposition 19.5, this is in fact the solution of minimal norm of this equation,
which is the best approximation to a solution of $AX = w$.


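Example Consider the system of linear equations $AX = w$, where
$$A = \begin{bmatrix} 2 & -4 & 5 \\ 6 & 0 & 3 \\ 2 & -4 & 5 \\ 6 & 0 & 3 \end{bmatrix} \quad \text{and} \quad w = \begin{bmatrix} 1 \\ 3 \\ -1 \\ 3 \end{bmatrix}.$$
Then
$$A^+ = \frac{1}{72}\begin{bmatrix} -2 & 6 & -2 & 6 \\ -5 & 3 & -5 & 3 \\ 4 & 0 & 4 & 0 \end{bmatrix} \quad \text{and} \quad A^+w = \frac{1}{4}\begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}.$$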


In order to emphasize the use of Proposition 19.5, we briefly consider the least 
squares method, which is an important tool in many areas of applied mathemat- 
ics and statistics. This method was developed at the beginning of the nineteenth 
century by Gauss and Legendre and, independently, by the American mathematical 
pioneer Robert Adrain. Suppose that we have before us the results of several
observations, which, depending on values $t_1, \ldots, t_n$ of a real parameter, give us real
values $c_1, \ldots, c_n$. Our theory tells us that the set of points $\{(t_i, c_i) \mid 1 \le i \le n\}$ in
the Euclidean plane should lie on a straight line. However, because of measuring
and/or computational errors, this does not quite work out. So we want to find
the equation of the line in the plane which best fits our observed data. In other
words, we want to find a solution of minimal norm to the system of linear equations
$\{X_1 + t_iX_2 = c_i \mid 1 \le i \le n\}$, which can be written as
$$\begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ \vdots & \vdots \\ 1 & t_n \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}.$$
As we have seen, the solution of minimal norm, if it exists, is
$$\begin{bmatrix} 1 & t_1 \\ 1 & t_2 \\ \vdots & \vdots \\ 1 & t_n \end{bmatrix}^+ \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}.$$
Otherwise, this is the best approximation to a solution of the system.


With kind permission of the University of Pennsylvania Libraries. 


Irish-born Robert Adrain emigrated to the United States in 1798. He 
published his own mathematics journal, but his work received no in- 
ternational attention at the time. 


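Example To find the equation of the line in the Euclidean plane which best fits the
set of points $\{(1, 3), (2, 7), (3, 8), (4, 11)\}$, we calculate
$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}^+ \begin{bmatrix} 3 \\ 7 \\ 8 \\ 11 \end{bmatrix} = \frac{1}{20}\begin{bmatrix} 20 & 10 & 0 & -10 \\ -6 & -2 & 2 & 6 \end{bmatrix} \begin{bmatrix} 3 \\ 7 \\ 8 \\ 11 \end{bmatrix} = \begin{bmatrix} 1 \\ \frac{5}{2} \end{bmatrix},$$
so the line we want is given by $\{(t, 1 + \frac{5}{2}t) \mid t \in \mathbb{R}\}$.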

We can use the same method to find the best fit of any polynomial of a higher
degree to a set of points. For example, if we wish to find a parabola which best fits
the set of points $\{(t_i, c_i) \mid 1 \le i \le n\}$ in the Euclidean plane, we have to find a best
approximation to a solution of the system of linear equations
$\{X_1 + t_iX_2 + t_i^2X_3 = c_i \mid 1 \le i \le n\}$, which we know is
$$\begin{bmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ \vdots & \vdots & \vdots \\ 1 & t_n & t_n^2 \end{bmatrix}^+ \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}.$$


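Example To find the equation of the parabola in the Euclidean plane which best fits
the set of points $\{(1, 3), (2, 7), (3, 8), (4, 11)\}$, we calculate
$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix}^+ \begin{bmatrix} 3 \\ 7 \\ 8 \\ 11 \end{bmatrix} = \frac{1}{20}\begin{bmatrix} 45 & -15 & -25 & 15 \\ -31 & 23 & 27 & -19 \\ 5 & -5 & -5 & 5 \end{bmatrix} \begin{bmatrix} 3 \\ 7 \\ 8 \\ 11 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} -1 \\ 15 \\ -1 \end{bmatrix},$$
and so the parabola we want is given by $\{(t, -\frac{1}{4} + \frac{15}{4}t - \frac{1}{4}t^2) \mid t \in \mathbb{R}\}$.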


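Needless to say, we can also consider a much more general context. Suppose that
$W$ is a finitely-generated subspace of $\mathbb{R}^A$. Given a set of observations
$\{(t_i, c_i) \mid 1 \le i \le n\} \subseteq A \times \mathbb{R}$, we want to find the function $g \in W$ which best approximates these
observations.

To do this, we pick a basis $\{f_1, \ldots, f_k\}$ for $W$. Then we want to find a best
approximation to a solution of the system of linear equations
$$\{X_1f_1(t_i) + \cdots + X_kf_k(t_i) = c_i \mid 1 \le i \le n\},$$
which can be written as
$$\begin{bmatrix} f_1(t_1) & \cdots & f_k(t_1) \\ \vdots & & \vdots \\ f_1(t_n) & \cdots & f_k(t_n) \end{bmatrix} \begin{bmatrix} X_1 \\ \vdots \\ X_k \end{bmatrix} = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}.$$
As we have seen, this is
$$\begin{bmatrix} f_1(t_1) & \cdots & f_k(t_1) \\ \vdots & & \vdots \\ f_1(t_n) & \cdots & f_k(t_n) \end{bmatrix}^+ \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}.$$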

Least-squares approximations are often used to find best-fit solutions to very 
large systems of linear equations of the form AX = w which, in theory, have an 
exact solution but in practice that solution cannot be found because of errors in 
measurement of the data and computational errors. Indeed, Gauss developed this 
method for finding solutions to the very large systems of linear equations which 
resulted from laying down a triangulation grid for a geodetic survey of the state of 
Hanover he conducted in 1818. In 1978, the American National Geodetic Survey 
used it to solve a system of over 2.5 million linear equations in 400,000 unknowns 
which resulted from the updating of the triangulation grid for the continental United 
States. 

The constructions presented in this chapter can be generalized considerably.
Indeed, if $(K, \bullet)$ is an associative unital algebra over a field $F$ on which we have
defined an involution $a \mapsto a^*$, then an element $a$ of $K$ has a Moore-Penrose
pseudoinverse $b$ if and only if the following conditions are satisfied:

(1) $a \bullet b \bullet a = a$ and $b \bullet a \bullet b = b$;
(2) $(b \bullet a)^* = b \bullet a$ and $(a \bullet b)^* = a \bullet b$.

Proposition 19.1 can easily be modified to show that such a pseudoinverse, if it
exists, is unique. Pseudoinverses of this sort show up in the study of $C^*$-algebras, or,
more generally, associative unital algebras $(K, \bullet)$ that satisfy the Gelfand-Naimark
property, namely that $e + a^* \bullet a$ is a unit of $K$ for each $a \in K$, where $e$ is the
multiplicative identity of $K$. In such algebras, it is possible to show that if $a \in K$
satisfies the condition that there exists an element $b \in K$ satisfying $a \bullet b \bullet a = a$ and
$b \bullet a \bullet b = b$, then $a$ has a Moore-Penrose pseudoinverse.


With kind permission of the Archives of the Mathematisches
Forschungsinstitut Oberwolfach (Gelfand); With kind permission
of the American Mathematical Society (Naimark).
Israil Moisseevich Gelfand was a twentieth-century
Russian mathematician who emigrated
to the United States. He worked in many areas
of analysis and mathematical biology. Mark
Aronovich Naimark was a twentieth-century
Ukrainian mathematician who worked primarily in
functional analysis.


Finally, one should note that the Moore—Penrose pseudoinverse is just one of 
many “pseudoinverses” in the mathematical literature, each designed for a fairly 
specific purpose. The first of these was introduced by Fredholm in 1903 to deal with 
integral operators. Others are based on specific situations which arise in algebra or 
analysis, or which are used to implement specific computational methods. 


Example Let $V$ be a vector space finitely generated over a field $F$ and let
$\alpha \in \operatorname{End}(V)$. Let $k = \inf\{0 < h \in \mathbb{N} \mid \operatorname{rk}(\alpha^h) = \operatorname{rk}(\alpha^{h+1})\}$. Then the Drazin
pseudoinverse of $\alpha$ is the endomorphism $\beta$ of $V$ satisfying $\alpha^{k+1}\beta = \alpha^k$, $\beta\alpha\beta = \beta$,
and $\alpha\beta = \beta\alpha$. If such a $\beta$ exists, it is necessarily unique. It is immediate that if
$\alpha \in \operatorname{Aut}(V)$ then $k = 1$ and $\beta = \alpha^{-1}$. If $\alpha$ is nilpotent then its Drazin pseudoinverse
is $\sigma_0$. Drazin pseudoinverses have important applications in differential equations
and in mathematical economics.


Exercises 


Exercise 1160 


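Let $V$ and $W$ be finitely-generated inner product spaces and let $\alpha \in \operatorname{Hom}(V, W)$.
Let $\beta \in \operatorname{Hom}(W, V)$ satisfy $\alpha\beta\alpha = \alpha$. Show that $\operatorname{rk}(\beta) \ge \operatorname{rk}(\alpha)$, with equality
holding if and only if $\beta\alpha\beta = \beta$.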


Exercise 1161 


1 
Let A=] —1 € M3x2(R). Calculate AT. 
2 


WD — = 


Exercise 1162 
Let $A = \begin{bmatrix} 5 & 0 & 0 \end{bmatrix} \in M_{1\times 3}(\mathbb{Q})$. Calculate $A^+$.


Exercise 1163 
Let $A = \begin{bmatrix} a_1 & \ldots & a_n \end{bmatrix} \in M_{1\times n}(\mathbb{C})$, where $n$ is a positive integer. Show that
$A^+ = (AA^H)^{-1}A^H$.


Exercise 1164
Let $A = \begin{bmatrix} 2 & 2 & 0 \\ 1 & 2 & 1 \\ 1 & 2 & 1 \end{bmatrix} \in M_{3\times 3}(\mathbb{R})$. Calculate $A^+$.


Exercise 1165 
Let $V$ and $W$ be finitely-generated inner product spaces, and let $\alpha \in \operatorname{Hom}(V, W)$.
For any nonzero scalar $c$, show that $(c\alpha)^+ = \frac{1}{c}\alpha^+$.


Exercise 1166 
Let $n$ be a positive integer and let $A \in M_{n\times n}(\mathbb{R})$ be a diagonal matrix.
Calculate $A^+$.


Exercise 1167 
Let $V$ and $W$ be finitely-generated inner product spaces, and let $\alpha \in \operatorname{Hom}(V, W)$.
Show that $(\alpha^*)^+ = (\alpha^+)^*$.


Exercise 1168 
Let V = R?, which is endowed with the dot product and let a: V > R be the 
a 


linear functional defined by a : b 


t> a. Let 8B: RV be the linear transfor- 


mation defined by B : at> * Show that (aB)t 4 BTar. 


Exercise 1169 
Let $n$ be a positive integer and let $A = [a_{ij}] \in M_{n\times n}(\mathbb{R})$ be the matrix all entries
of which are equal to 1. Show that $A^+ = n^{-2}A$.


Exercise 1170 


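Let $A \in M_{k\times n}(\mathbb{R})$ be a matrix of the form $\begin{bmatrix} C & O \\ O & O \end{bmatrix}$, where $C$ is a $t \times t$ nonsingular
diagonal matrix. Show that $A^+ = \begin{bmatrix} D & O \\ O & O \end{bmatrix}$, where $D = C^{-1}$.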


Exercise 1171 

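Let $V = \mathbb{R}^n$, on which we have the dot product defined, and let $\alpha \in \operatorname{End}(V)$ satisfy
the condition that $\ker(\alpha) = \operatorname{im}(\alpha)^\perp$. Show that the restriction $\beta$ of $\alpha$ to $\operatorname{im}(\alpha)$ is
an automorphism of $\operatorname{im}(\alpha)$ and that the restriction of $\alpha^+$ to $\operatorname{im}(\alpha)$ equals $\beta^{-1}$.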


Exercise 1172 
Let $n$ be a positive integer and let $A, B \in M_{n\times n}(\mathbb{R})$ be matrices satisfying the
conditions $ABA = A$, $BAB = B$, and $A^2 = A$. Is it necessarily true that $B^2 = B$?


Exercise 1173
Let $h$, $k$, $m$, and $n$ be positive integers, let $A \in M_{h\times k}(\mathbb{R})$, let $B \in M_{m\times n}(\mathbb{R})$,
and let $C \in M_{h\times n}(\mathbb{R})$. Show that there exists a matrix $X \in M_{k\times m}(\mathbb{R})$ satisfying
$AXB = C$ if and only if $AA^+CB^+B = C$.


Exercise 1174 
Let $k$ and $n$ be positive integers and let $B \in M_{k\times k}(\mathbb{R})$ and $C \in M_{n\times n}(\mathbb{R})$ be
orthogonal matrices. For $A \in M_{k\times n}(\mathbb{R})$, show that $(BAC)^+ = C^TA^+B^T$.


Exercise 1175 
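Let $k$ and $n$ be positive integers and let $A \in M_{k\times k}(\mathbb{R})$ and $B \in M_{k\times n}(\mathbb{R})$. Let
$C \in M_{n\times n}(\mathbb{R})$ be nonsingular. Prove that
$$\begin{bmatrix} A & AB \\ O & C \end{bmatrix}^+ = \begin{bmatrix} A^+ & -A^+ABC^{-1} \\ O & C^{-1} \end{bmatrix}.$$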


Bilinear Transformations and Forms 20


Let $V$, $W$, and $Y$ be vector spaces over a field $F$. We say that a function
$f: V \times W \to Y$ is a bilinear transformation if and only if the function $v \mapsto f(v, w_0)$
belongs to $\operatorname{Hom}(V, Y)$ for any given vector $w_0 \in W$ and the function $w \mapsto f(v_0, w)$
belongs to $\operatorname{Hom}(W, Y)$ for any given vector $v_0 \in V$. The set of all bilinear
transformations from $V \times W$ to $Y$ will be denoted by $\operatorname{Bil}(V \times W, Y)$. If
$f, g \in \operatorname{Bil}(V \times W, Y)$ and if $c \in F$ then $f + g$ and $cf$ also belong to $\operatorname{Bil}(V \times W, Y)$, and so
$\operatorname{Bil}(V \times W, Y)$ is a subspace of the vector space $Y^{V \times W}$ over $F$. Also, any bilinear
transformation $f: V \times W \to Y$ defines a bilinear transformation $f^{op}: W \times V \to Y$,
called the opposite transformation of $f$, by setting $f^{op}: (w, v) \mapsto f(v, w)$. It is
clear that the function
$$(\ )^{op}: \operatorname{Bil}(V \times W, Y) \to \operatorname{Bil}(W \times V, Y)$$
is an isomorphism of vector spaces. We say that a bilinear transformation
$f \in \operatorname{Bil}(V \times V, Y)$ is symmetric if and only if $f = f^{op}$. It is skew symmetric if
and only if $f = -f^{op}$.

In particular, if we consider a single vector space $V$ over a field $F$, then we note
that $f \in \operatorname{Bil}(V \times V, V)$ if and only if the operation $\bullet$ on $V$ defined by $v \bullet w =
f(v, w)$ turns $V$ into an $F$-algebra. This algebra is commutative if and only if $f$ is
symmetric.


Example Let $n$ be a positive integer and let $V = \mathbb{R}^n$, on which we have the dot
product defined. A classical problem in geometry is to ask if there exists a bilinear
transformation $f \in \operatorname{Bil}(V \times V, V)$ satisfying the condition that $\|f(v, w)\| = \|v\| \cdot \|w\|$
for all $v, w \in V$. Euler showed that such a transformation exists for the case $n = 4$.
At the end of the nineteenth century, Hurwitz showed that such transformations exist
only when $n = 1, 2, 4$, or $8$.


J.S. Golan, The Linear Algebra a Beginning Graduate Student Ought to Know, 453 
DOI 10.1007/978-94-007-2636-9_20, © Springer Science+Business Media B.V. 2012 
With kind permission of ETH-Bibliothek Zurich, Image Archive.


Adolph Hurwitz was a nineteenth-century German mathematician 
who taught both Hilbert and Minkowski. 


Example Let $F$ be a field and let $k$, $n$, and $t$ be positive integers. Set $V = M_{k\times n}(F)$,
$W = M_{t\times n}(F)$, and $Y = M_{k\times t}(F)$. Then there exists a bilinear transformation
$V \times W \to Y$ defined by $(A, B) \mapsto AB^T$. In particular, we have a bilinear
transformation $F^n \times F^n \to M_{n\times n}(F)$ given by $(v, w) \mapsto v \wedge w$. More generally, if $V$, $W$,
and $Y$ are as mentioned, every matrix $C \in M_{n\times n}(F)$ defines a bilinear
transformation $V \times W \to Y$ by setting $(A, B) \mapsto ACB^T$.


Example For vector spaces V and W over a field F,, the function Hom(V, W) x 
V — W given by (a, v) + a@(v) is a bilinear transformation. 


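Let $V$, $W$, and $Y$ be vector spaces over a field $F$. The image of a bilinear
transformation $f \in \operatorname{Bil}(V \times W, Y)$ is not necessarily a subspace of $Y$, as the following
example shows.


Example Consider the bilinear transformation $f: \mathbb{R}^2 \times \mathbb{R}^2 \to M_{2\times 2}(\mathbb{R})$ defined
by $f: (v, w) \mapsto v \wedge w$. The image of $f$ contains $\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$, but is not a
subspace since $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \notin \operatorname{im}(f)$.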


As with linear transformations, bilinear transformations are totally determined
by their behavior on bases. That is to say, let $V$ and $W$ be vector spaces over a
field $F$, and let $B = \{v_i \mid i \in \Omega\}$ and $D = \{w_j \mid j \in \Lambda\}$ be bases of $V$ and $W$,
respectively. Let $Y$ be a vector space over $F$ and let $f_0: B \times D \to Y$ be a
function. Then there exists a unique bilinear transformation $f \in \operatorname{Bil}(V \times W, Y)$
satisfying $f(v_i, w_j) = f_0(v_i, w_j)$ for all $i$ and $j$, namely the function defined
by $f: \left(\sum_{i \in \Omega} a_iv_i, \sum_{j \in \Lambda} b_jw_j\right) \mapsto \sum_{i \in \Omega}\sum_{j \in \Lambda} a_ib_jf_0(v_i, w_j)$. In the case that
$V = W = Y$, we have already noted this fact in Proposition 5.5.


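Proposition 20.1 If $V$, $W$, and $Y$ are vector spaces over a field $F$, then
$\operatorname{Bil}(V \times W, Y)$ is isomorphic to $\operatorname{Hom}(V, \operatorname{Hom}(W, Y))$.


Proof Define a function $\theta: \operatorname{Bil}(V \times W, Y) \to \operatorname{Hom}(V, \operatorname{Hom}(W, Y))$ as follows:
given a bilinear transformation $f \in \operatorname{Bil}(V \times W, Y)$ and a vector $v \in V$, then
$\theta(f)(v): w \mapsto f(v, w)$. It is straightforward to check that indeed $\theta(f)(v) \in
\operatorname{Hom}(W, Y)$ for all $f \in \operatorname{Bil}(V \times W, Y)$ and all $v \in V$. Moreover, $\theta(f)(v_1 + v_2) =
\theta(f)(v_1) + \theta(f)(v_2)$ and $\theta(f)(cv) = c\theta(f)(v)$ for all $v, v_1, v_2 \in V$ and all $c \in F$,
so $\theta(f) \in \operatorname{Hom}(V, \operatorname{Hom}(W, Y))$ for all $f \in \operatorname{Bil}(V \times W, Y)$. Finally, $\theta(f + g) =
\theta(f) + \theta(g)$ and $\theta(cf) = c\theta(f)$ for all $f, g \in \operatorname{Bil}(V \times W, Y)$ and all $c \in F$, and so
we have shown that $\theta$ is a linear transformation.

It is also possible to define a function
$$\varphi: \operatorname{Hom}(V, \operatorname{Hom}(W, Y)) \to \operatorname{Bil}(V \times W, Y)$$
by setting $\varphi(\alpha): (v, w) \mapsto \alpha(v)(w)$ for all $v \in V$ and $w \in W$, and again it is easy
to show that this is a linear transformation. If $\alpha \in \operatorname{Hom}(V, \operatorname{Hom}(W, Y))$ and $v \in V$,
then $\theta\varphi(\alpha)(v): w \mapsto \varphi(\alpha)(v, w) = \alpha(v)(w)$ and so $\theta\varphi(\alpha)(v) = \alpha(v)$ for all $v \in V$.
Thus $\theta\varphi(\alpha) = \alpha$ for all $\alpha \in \operatorname{Hom}(V, \operatorname{Hom}(W, Y))$, and so $\theta\varphi$ is the identity function
on $\operatorname{Hom}(V, \operatorname{Hom}(W, Y))$. Conversely, if $f \in \operatorname{Bil}(V \times W, Y)$ then
$$\varphi\theta(f): (v, w) \mapsto \theta(f)(v)(w) = f(v, w)$$
for all $v \in V$ and $w \in W$ and so $\varphi\theta(f) = f$ for all $f \in \operatorname{Bil}(V \times W, Y)$, proving
that $\varphi\theta$ is the identity function on $\operatorname{Bil}(V \times W, Y)$. Thus we have established that $\theta$
is an isomorphism, with $\theta^{-1} = \varphi$.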


Let $V$ and $W$ be vector spaces over a field $F$. A bilinear transformation
$f: V \times W \to F$ is called a bilinear form. We will denote the set of all such bilinear
forms by $\operatorname{Bil}(V \times W)$, instead of $\operatorname{Bil}(V \times W, F)$. By what we have seen above,
$\operatorname{Bil}(V \times W)$ is a subspace of $F^{V \times W}$ which is isomorphic to $\operatorname{Hom}(V, D(W))$. If
$V$ and $W$ are vector spaces over a field $F$, then a bilinear form $f \in \operatorname{Bil}(V \times W)$
is nondegenerate if and only if for each $0_V \neq v \in V$ there exists a $w \in W$
satisfying $f(v, w) \neq 0$ and for each $0_W \neq w \in W$ there exists a $v \in V$ satisfying
$f(v, w) \neq 0$.


With kind permission of the Archives of the Mathematisches Forschungsinstitut Ober- 
wolfach. 

Mathematicians at the beginning of the nineteenth century, such as 
Gauss and Jacobi, preferred to state their results in terms of bilinear 
forms rather than in terms of matrices. Sylvester contributed greatly to 
the theory of bilinear forms, as did the influential nineteenth-century 
German mathematician Karl Weierstrass. 


Example If $V$ is an inner product space over $\mathbb{R}$, then the function $(v, w) \mapsto \langle v, w\rangle$
belongs to $\operatorname{Bil}(V \times V)$. This is not true, of course, if our field of scalars is $\mathbb{C}$.


Example If $F$ is a field and $V = F^n$ for some positive integer $n$, then the function
$(v, w) \mapsto v \odot w$ belongs to $\operatorname{Bil}(F^n \times F^n)$. This function is particularly useful in
the case $F = GF(2)$. Indeed, if $v \in GF(2)^n$, then
$$v \odot v = \begin{cases} 0 & \text{if an even number of entries in } v \text{ are equal to 1,} \\ 1 & \text{if an odd number of entries in } v \text{ are equal to 1.} \end{cases}$$
This value is known as the parity of $v$.
More generally, let $\Lambda$ be a finite set and let $V$ be the collection of all subsets of $\Lambda$.
Define a function $f: V \times V \to GF(2)$ by setting
$$f(A, B) = \begin{cases} 0 & \text{if } A \cap B \text{ has an even number of elements,} \\ 1 & \text{if } A \cap B \text{ has an odd number of elements.} \end{cases}$$
Then $f \in \operatorname{Bil}(V \times V)$.
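
Both parts of this example can be checked by brute force on a small case. The sketch below makes one assumption explicit: the collection of all subsets of a finite set is regarded as a vector space over $GF(2)$ with symmetric difference as vector addition, which is the usual identification of a subset with its 0/1 indicator vector. The function names are hypothetical.

```python
from itertools import combinations

def dot_gf2(v, w):
    """Dot product of two 0/1 vectors, reduced modulo 2."""
    return sum(a * b for a, b in zip(v, w)) % 2

def parity(v):
    """Parity of a 0/1 vector: 1 if it has an odd number of ones, else 0."""
    return dot_gf2(v, v)

def f(A, B):
    """Size of the intersection of two sets, reduced modulo 2."""
    return len(A & B) % 2

ground = [0, 1, 2]
subsets = [frozenset(s) for r in range(len(ground) + 1)
           for s in combinations(ground, r)]

# Additivity in the first argument, with A ^ C the symmetric difference.
# Since f is symmetric, additivity in the second argument follows.
additive = all(f(A ^ C, B) == (f(A, B) + f(C, B)) % 2
               for A in subsets for B in subsets for C in subsets)

print(parity([1, 0, 1, 1]), additive)
```

Over $GF(2)$ the only scalars are 0 and 1, so additivity in each argument is all that has to be verified for bilinearity.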


Example If $V$ is a vector space over a field $F$, we have a nondegenerate bilinear
form in $\operatorname{Bil}(D(V) \times V)$ given by $(\delta, v) \mapsto \delta(v)$. Similarly, if $\delta_1, \delta_2 \in D(V)$, we
have a bilinear form in $\operatorname{Bil}(V \times V)$ given by $(v, w) \mapsto \delta_1(v)\delta_2(w)$, which is
nondegenerate if $\delta_1$ and $\delta_2$ are not the 0-functional.


Example If $F$ is a field and if $k$ and $n$ are positive integers, then each matrix
$A \in M_{k\times n}(F)$ defines a bilinear form in $\operatorname{Bil}(F^k \times F^n)$ by $(v, w) \mapsto v \odot Aw$.


Example If $F$ is a field of characteristic not equal to 2 and if $V$ is a vector space
over $F$, then any $f \in \operatorname{Bil}(V \times V)$ can be written as a sum of a symmetric bilinear
form and a skew-symmetric bilinear form, namely $f = f_1 + f_2$, where $f_1: (v, w) \mapsto
\frac{1}{2}[f(v, w) + f(w, v)]$ and $f_2: (v, w) \mapsto \frac{1}{2}[f(v, w) - f(w, v)]$. Moreover, this
representation is unique, for if $f = g_1 + g_2$, where $g_1$ is symmetric and $g_2$ is skew
symmetric, then for each $(v, w) \in V \times V$ we have $f(v, w) + f(w, v) = 2g_1(v, w)$
and $f(v, w) - f(w, v) = 2g_2(v, w)$, from which we deduce that $g_i = f_i$ for $i = 1, 2$.
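
When the bilinear form is given by a square matrix $T$, as in the previous example, the decomposition amounts to splitting $T$ into $\frac{1}{2}(T + T^T)$ and $\frac{1}{2}(T - T^T)$. The following sketch over the rational numbers illustrates this; the function name `split_form` is hypothetical.

```python
from fractions import Fraction

def split_form(T):
    """Split a square matrix into its symmetric and skew-symmetric parts."""
    n = len(T)
    half = Fraction(1, 2)
    sym = [[half * (T[i][j] + T[j][i]) for j in range(n)] for i in range(n)]
    skew = [[half * (T[i][j] - T[j][i]) for j in range(n)] for i in range(n)]
    return sym, skew

T = [[1, 2], [4, 3]]
sym, skew = split_form(T)
print(sym, skew)
```

The division by 2 is exactly where the hypothesis on the characteristic is used: over a field of characteristic 2 the factor $\frac{1}{2}$ does not exist.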


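If $V$ and $W$ are vector spaces over $F$ of finite dimension $k$ and $n$, respectively,
then any bilinear form on $V \times W$ can be represented as in the previous example.
Indeed, if we fix bases $B = \{v_1, \ldots, v_k\}$ for $V$ and $D = \{w_1, \ldots, w_n\}$ for $W$, then for
any $f \in \operatorname{Bil}(V \times W)$ we define the matrix $T_{BD}(f) = [f(v_i, w_j)] \in M_{k\times n}(F)$ and
check that if $v = \sum_{i=1}^k a_iv_i$ and $w = \sum_{j=1}^n b_jw_j$, then
$$f(v, w) = \begin{bmatrix} a_1 \\ \vdots \\ a_k \end{bmatrix} \odot T_{BD}(f) \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix}.$$
Indeed, for fixed $B$ and $D$, the function $f \mapsto T_{BD}(f)$ is an isomorphism from
$\operatorname{Bil}(V \times W)$ to $M_{k\times n}(F)$.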


Example Let F be a field and let V = F”. Consider the bases B = {fo ; Hi 


and D= {| | ‘ 1: || of V.If f € Bill(V x V) is given by f : (Hi F 1‘) a 


(a + b)(c +d), then it is easy to verify that Tgp(f) = F | and Tpp(f) = 


[o 4) 
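

Proposition 20.2 Let $V$ and $W$ be vector spaces finitely generated over a
field $F$ and having bases $B = \{v_1, \ldots, v_k\}$ and $D = \{w_1, \ldots, w_n\}$, respectively.
Let $C = \{x_1, \ldots, x_k\}$ and $E = \{y_1, \ldots, y_n\}$ also be bases for $V$ and $W$,
respectively, and let $P = [p_{ir}] \in M_{k\times k}(F)$ and $Q = [q_{js}] \in M_{n\times n}(F)$ be
nonsingular matrices satisfying
$$\begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} = P\begin{bmatrix} v_1 \\ \vdots \\ v_k \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = Q\begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}.$$
Then for $f \in \operatorname{Bil}(V \times W)$ we see that $T_{CE}(f) = PT_{BD}(f)Q^T$.


Proof As a direct consequence of the definitions, we see that
$$f(x_i, y_j) = f\left(\sum_{r=1}^k p_{ir}v_r, \sum_{s=1}^n q_{js}w_s\right) = \sum_{r=1}^k \sum_{s=1}^n p_{ir}f(v_r, w_s)q_{js},$$
and this is precisely the $(i, j)$th entry of $PT_{BD}(f)Q^T$.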


In particular, we see that if $f \in \mathrm{Bill}(V \times V)$, where V is a vector space of finite
dimension n over a field F, and if B and D are bases of V, then there exists a
nonsingular matrix $P \in M_{n \times n}(F)$ satisfying $T_{DD}(f) = P\,T_{BB}(f)\,P^T$. In general,
matrices A and C in $M_{n \times n}(F)$ are congruent if and only if there exists a nonsingular
matrix $P \in M_{n \times n}(F)$ satisfying $C = PAP^T$. Congruence is easily checked
to be an equivalence relation on $M_{n \times n}(F)$, which joins the relations of equivalence
and similarity that we have already defined. Congruent matrices clearly have the
same rank, so that the rank of a matrix of the form $T_{BB}(f)$ depends only on f and
not on the choice of basis B. Therefore, we call this the rank of the bilinear form f.
Thus, for example, the bilinear forms in $\mathrm{Bill}(V \times V)$ of rank 1 are precisely those
of the form $(v, w) \mapsto \alpha(v)\beta(w)$, where $\alpha, \beta \in D(V)$.

A matrix congruent to a symmetric matrix is again symmetric. Indeed, if
$A \in M_{n \times n}(F)$ is symmetric, then for any nonsingular matrix P we have
$(PAP^T)^T = (P^T)^T A^T P^T = PAP^T$.


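Example The matrix $A = \begin{bmatrix} 1 & -6 & -6 \\ -6 & 40 & 39 \\ -6 & 39 & 39 \end{bmatrix} \in M_{3 \times 3}(\mathbb{R})$ is congruent to I, since
$PAP^T = I$, where $P = \begin{bmatrix} 1 & 0 & 0 \\ 3 & \frac{1}{2} & 0 \\ \sqrt{3} & -\frac{\sqrt{3}}{2} & \frac{2}{\sqrt{3}} \end{bmatrix}$.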


As was the case with inner products, we can define orthogonality with respect
to an arbitrary bilinear form. This concept has important applications when we are
working over fields other than R or C, and especially in areas such as algebraic
coding theory, where all of the work is done over finite fields. Let V be a vector
space over a field F and let $f \in \mathrm{Bill}(V \times V)$. Vectors $v, w \in V$ are f-orthogonal if
and only if $f(v, w) = 0$. In this case, we will write $v \perp_f w$. (One has to be careful
here: it may be true that $v \perp_f w$ but false that $w \perp_f v$; this will not happen, of
course, if f is symmetric.) If A is a nonempty subset of V, then we define the
right f-orthogonal complement of A to be the set $A^{\perp_f} = \{w \in V \mid v \perp_f w \text{ for all } v \in A\}$.
Complements of this form may behave very differently from complements
defined by inner products, as the following example shows.


Example Let F = GF(2) and let V = F*. Define f € Bill(V x V) by setting 


0 1 1 0 
f(v, w) =v©w. Then W = : ; ; : : ; is a subspace of V 
0 1 0 1 


which satisfies Wtf = W. 
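
A subspace that coincides with its own right f-orthogonal complement can be found by brute force over GF(2). The sketch below is illustrative only: the generators `(1, 1, 0, 0)` and `(0, 0, 1, 1)` are a hypothetical choice and need not be the subspace used in the example above, and the form is taken to be $f(v, w) = \sum_i v_i w_i$ computed modulo 2.

```python
from itertools import product

def dot_gf2(v, w):
    """The bilinear form f(v, w) = sum of v_i * w_i, computed in GF(2)."""
    return sum(a * b for a, b in zip(v, w)) % 2

def span_gf2(generators, n):
    """All GF(2)-linear combinations of the given generators in GF(2)^n."""
    vectors = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        vec = tuple(sum(c * g[i] for c, g in zip(coeffs, generators)) % 2
                    for i in range(n))
        vectors.add(vec)
    return vectors

def right_complement(subset, n):
    """Set of all w in GF(2)^n with f(v, w) = 0 for every v in subset."""
    return {w for w in product((0, 1), repeat=n)
            if all(dot_gf2(v, w) == 0 for v in subset)}

# Hypothetical subspace of GF(2)^4 chosen for illustration.
W = span_gf2([(1, 1, 0, 0), (0, 0, 1, 1)], 4)
W_perp = right_complement(W, 4)
```

For this choice `W_perp` equals `W`, something that cannot happen for a nontrivial subspace of a real inner product space.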

We note that $V^{\perp_f}$ is trivial if and only if for any $0_V \neq w \in V$ there exists a
vector $v \in V$ satisfying $f(v, w) \neq 0$. This condition is not a consequence of our
definitions, and we must explicitly state it when we need it. It holds, of course, if f
is nondegenerate.


1 1 
Example Let V=R i ; C R*. If f € Bill(V x V) is defined by 
1 —1 
ay by 0 
f: se ; Ke +> aby +a2b2 +.43b3 —aaba, then : ¢ V+/ and, indeed, 
3 3 
a4 b4 1 
1 0 
—l 0 
tre 
a i O};’] 1 
0 1 


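Proposition 20.3 Let V be a vector space over a field F, and let
$f \in \mathrm{Bill}(V \times V)$. If A is a nonempty subset of V then:

(1) $A^{\perp_f}$ is a subspace of V;

(2) $A^{\perp_f} = (FA)^{\perp_f}$;

(3) If $A \subseteq B$ then $B^{\perp_f} \subseteq A^{\perp_f}$.

Moreover, if $\{A_i \mid i \in \Omega\}$ is a collection of nonempty subsets of V, then

$$\left(\bigcup_{i \in \Omega} A_i\right)^{\perp_f} = \bigcap_{i \in \Omega} A_i^{\perp_f}.$$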
Proof The proof of (1)-(3) is an immediate consequence of the definitions. To prove
the last statement, we note that if $w \in V$, then $w \in \left(\bigcup_{i \in \Omega} A_i\right)^{\perp_f}$ if and only if for
each $i \in \Omega$ and each $v \in A_i$ we have $f(v, w) = 0$. This is true if and only if $w \in A_i^{\perp_f}$
for each $i \in \Omega$, namely if and only if $w \in \bigcap_{i \in \Omega} A_i^{\perp_f}$.


Proposition 20.4 Let V be a vector space finitely generated over a field F
and let $f \in \mathrm{Bill}(V \times V)$ satisfy the condition that $V^{\perp_f}$ is trivial. Then each
subspace W of V satisfies the following conditions:

(1) If $\delta \in D(W)$ there exists a $v \in V$ such that $\delta(w) = f(v, w)$ for all $w \in W$;
(2) $\dim(W) + \dim(W^{\perp_f}) = \dim(V)$.


Proof (1) Every vector $v \in V$ defines a linear functional $\delta_v \in D(V)$ by setting
$\delta_v : y \mapsto f(y, v)$. Moreover, the function $v \mapsto \delta_v$ from V to D(V) is a linear
transformation, which is a monomorphism as a result of the condition that $V^{\perp_f}$ is trivial.
But $\dim(V) = \dim(D(V))$ since V is finitely generated, and hence this is an
isomorphism. Now let $\delta \in D(W)$ and let Y be a complement of W in V. Then the
function from V to F given by $w + y \mapsto \delta(w)$ belongs to D(V) and so there exists
a vector $v \in V$ such that it equals $\delta_v$. In particular, $\delta(w) = f(v, w)$ for all $w \in W$,
proving (1).

(2) The function from V to D(W) which assigns to each $v \in V$ the restriction of
$\delta_v$ to W is a linear transformation which, by (1), is an epimorphism. The kernel of
this epimorphism consists of all vectors $v \in V$ satisfying $f(w, v) = 0$ for all $w \in W$,
and that is precisely $W^{\perp_f}$. Therefore, by Proposition 6.10, we have (2).


In particular, we see from Proposition 20.4 that a necessary and sufficient condition
for us to have $V = W \oplus W^{\perp_f}$ is that W and $W^{\perp_f}$ be disjoint.


Proposition 20.5 Let V and W be vector spaces finitely generated over a
field F and let $f \in \mathrm{Bill}(V \times W)$ be a bilinear form which is not the 0-function.
Then there exist bases $\{v_1, \ldots, v_k\}$ and $\{w_1, \ldots, w_n\}$ of V and W, respectively,
and there exists a positive integer $1 \le t \le \min\{k, n\}$ such that

$$f(v_i, w_j) = \begin{cases} 1 & \text{if } i = j \le t, \\ 0 & \text{otherwise.} \end{cases}$$


Proof Since f is not the 0-function, there exist vectors $v_1 \in V$ and $y_1 \in W$ such that
$f(v_1, y_1) \neq 0$. Therefore, if we set $w_1 = f(v_1, y_1)^{-1} y_1$, we have $f(v_1, w_1) = 1$. Let
$V_1 = Fv_1$ and $W_1 = Fw_1$. If we set $W_2 = \{w \in W \mid f(v_1, w) = 0\}$, then $W_1 \cap W_2 = \{0_W\}$
since $cw_1 \notin W_2$ for all $0 \neq c \in F$. We claim that $W = W_1 \oplus W_2$. Indeed, if
$w \in W$ and if $c = f(v_1, w)$ then we see that

$$f(v_1, w - cw_1) = f(v_1, w) - c f(v_1, w_1) = c - c = 0$$

and so $w - cw_1 \in W_2$, which proves the claim. In a similar way, we have
$V = V_1 \oplus V_2$, where $V_2 = \{v \in V \mid f(v, w_1) = 0\}$. Thus we see that $f(v, w) = 0$
whenever $(v, w) \in [V_1 \times W_2] \cup [V_2 \times W_1]$.

By passing to the opposite form if necessary, we can assume without loss of generality
that $k \le n$. If k = 1, we choose $\{v_1\}$ as a basis for V and $\{w_1, \ldots, w_n\}$ as a basis
for W, where $\{w_2, \ldots, w_n\}$ is an arbitrary basis for $W_2$. This proves the proposition,
with t = 1. Now assume that k > 1 (which implies n > 1) and that the proposition
has been proven whenever $\dim(V) < k$. In particular, we will look at the restriction
of f to $V_2 \times W_2$. If this restriction is the 0-function, then arbitrary bases of $V_2$ and
$W_2$ will do, with t = 1. Otherwise, by the induction hypothesis, there exist bases
$\{v_2, \ldots, v_k\}$ of $V_2$ and $\{w_2, \ldots, w_n\}$ of $W_2$ such that

$$f(v_i, w_j) = \begin{cases} 1 & \text{if } 2 \le i = j \le t, \\ 0 & \text{otherwise.} \end{cases}$$

Then $\{v_1, \ldots, v_k\}$ and $\{w_1, \ldots, w_n\}$ are the bases we want.


We see that if V and W are vector spaces finitely generated over a field F and if
$f \in \mathrm{Bill}(V \times W)$, then Proposition 20.5 says that there exist bases of V and W with
respect to which f is represented by a matrix of the form
$\begin{bmatrix} I & O \\ O & O \end{bmatrix}$.

We will be particularly interested in symmetric bilinear forms. As an immediate
consequence of the definition, we see that if V is a vector space finitely generated
over a field F and if B is a given basis for V, then a bilinear form $f \in \mathrm{Bill}(V \times V)$
is symmetric if and only if the matrix $T_{BB}(f)$ is symmetric. Moreover, every symmetric
matrix is $T_{BB}(f)$ for some symmetric bilinear form $f \in \mathrm{Bill}(V \times V)$.


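Example Let B be the canonical basis of $\mathbb{R}^3$ and let $A = \begin{bmatrix} 1 & -5 & 3 \\ -5 & 1 & 7 \\ 3 & 7 & 4 \end{bmatrix}$. Then
$A = T_{BB}(f)$, where $f \in \mathrm{Bill}(\mathbb{R}^3 \times \mathbb{R}^3)$ is defined by

$$f : \left(\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}\right) \mapsto a_1 b_1 + a_2 b_2 - 5(a_1 b_2 + a_2 b_1) + 3(a_1 b_3 + a_3 b_1) + 7(a_2 b_3 + a_3 b_2) + 4 a_3 b_3.$$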
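
The matrix of this example can be recovered mechanically by evaluating f on pairs of canonical basis vectors. The following is a minimal sketch; the function name `f` and the list `canonical` are introduced here only for illustration.

```python
def f(a, b):
    """The symmetric bilinear form on R^3 from the example above."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a1 * b1 + a2 * b2 - 5 * (a1 * b2 + a2 * b1)
            + 3 * (a1 * b3 + a3 * b1) + 7 * (a2 * b3 + a3 * b2)
            + 4 * a3 * b3)

canonical = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Entry (i, j) of T_BB(f) is f(e_i, e_j).
T = [[f(u, v) for v in canonical] for u in canonical]
```

The resulting `T` is the matrix A of the example, and it is symmetric because f is.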


Proposition 20.6 Let F be a field of characteristic other than 2 and let V be
a vector space finitely-generated over F. Let $f \in \mathrm{Bill}(V \times V)$ be symmetric.
Then there exists a basis $B = \{v_1, \ldots, v_n\}$ of V such that $T_{BB}(f)$ is a
diagonal matrix.
Proof The proposition is trivially true if f is the 0-function, and so we can assume
that is not the case. We will proceed by induction on $n = \dim(V)$. For n = 1, the
result is again immediate, and so we can assume that n > 1 and that the result has
been established for all spaces having dimension less than n. We first claim that
there exists a vector $v \in V$ satisfying $f(v, v) \neq 0$. Indeed, assume that this is not the
case. Then if v and w are arbitrary vectors in V we have

$$0 = f(v + w, v + w) = f(v, v) + 2f(v, w) + f(w, w) = 2f(v, w)$$

and since the characteristic of F is not 2, this implies that $f(v, w) = 0$, contradicting
our assumption that f is not the 0-function. Hence we can select a vector $v_1 \in V$
satisfying $f(v_1, v_1) \neq 0$.

Let $V_1 = Fv_1$ and let $V_2 = V_1^{\perp_f}$. From the definition of $V_2$ it is clear that $V_1$
and $V_2$ are disjoint, and from Proposition 20.3 it follows that $V = V_1 \oplus V_2$. In
particular, $\dim(V_2) = n - 1$ and so, by the induction hypothesis, there exists a
basis $C = \{v_2, \ldots, v_n\}$ of $V_2$ such that, if $f_2$ is the restriction of f to $V_2$, then
$T_{CC}(f_2)$ is a diagonal matrix. Since $f(v_1, v_i) = 0$ for all $2 \le i \le n$, it follows that
$B = \{v_1, \ldots, v_n\}$ does indeed give us the desired result.


Thus we see that every symmetric matrix over a field of characteristic other than 
2 is congruent to a diagonal matrix. 
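
The induction in the proof of Proposition 20.6 translates into a procedure of paired row and column operations. The sketch below is one way to carry it out over the rationals; it is an illustration under the assumption that the input is a symmetric matrix with rational entries, and the function name `congruence_diagonalize` is invented here rather than taken from the text.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][r] * Y[r][j] for r in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def congruence_diagonalize(matrix):
    """Return (P, D) with P nonsingular and D = P * matrix * P^T diagonal.

    Assumes `matrix` is symmetric with rational entries.
    """
    n = len(matrix)
    A = [[Fraction(x) for x in row] for row in matrix]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_multiple(target, source, c):
        # Replace A by E A E^T and P by E P, where E adds c * (row source)
        # to row target; the column step keeps A symmetric.
        for j in range(n):
            A[target][j] += c * A[source][j]
            P[target][j] += c * P[source][j]
        for i in range(n):
            A[i][target] += c * A[i][source]

    for k in range(n):
        if A[k][k] == 0:
            # Mirror the proof: find a vector v with f(v, v) != 0 by
            # replacing v_k with v_k + v_i or v_k - v_i.
            for i in range(k + 1, n):
                if A[k][i] != 0:
                    sign = 1 if 2 * A[k][i] + A[i][i] != 0 else -1
                    add_multiple(k, i, Fraction(sign))
                    break
        if A[k][k] == 0:
            continue  # row k is already zero to the right of the diagonal
        for i in range(k + 1, n):
            if A[i][k] != 0:
                add_multiple(i, k, -A[i][k] / A[k][k])
    return P, A

A0 = [[1, -5, 3], [-5, 1, 7], [3, 7, 4]]
P, D = congruence_diagonalize(A0)
```

Each call to `add_multiple` multiplies the running matrix by an elementary matrix of determinant 1, so the returned `P` is nonsingular. Working with `Fraction` keeps the arithmetic exact, which makes the identity $D = P A_0 P^T$ checkable by direct comparison.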


Proposition 20.7 Let V be a vector space finitely-generated over C and let
$f \in \mathrm{Bill}(V \times V)$ be a symmetric bilinear form of rank r. Then there exists a
basis $B = \{v_1, \ldots, v_n\}$ of V satisfying the following conditions:

(1) $T_{BB}(f)$ is a diagonal matrix;

(2) $f(v_i, v_i) = \begin{cases} 1 & \text{if } 1 \le i \le r, \\ 0 & \text{otherwise.} \end{cases}$


Proof By Proposition 20.6, we know that there is a basis $B = \{v_1, \ldots, v_n\}$ of V
satisfying the condition that $T_{BB}(f)$ is a diagonal matrix. This matrix is of rank r
and so, renumbering the basis elements if necessary, we can assume that $f(v_i, v_i) \neq 0$
when and only when $1 \le i \le r$. For each $1 \le i \le r$, define $c_i = f(v_i, v_i)^{-1/2} \in \mathbb{C}$,
and replace $v_i$ by $c_i v_i$ to get a basis satisfying (2) as well.


Let V be a vector space finitely-generated over a field F of characteristic other
than 2 and let $f \in \mathrm{Bill}(V \times V)$ be a bilinear form. The function $q : V \to F$ defined
by $q : v \mapsto f(v, v)$ is called the quadratic form defined by f. Note that if
$a \in F$ and $v \in V$ then $q(av) = f(av, av) = a^2 f(v, v) = a^2 q(v)$. Moreover, if
$f \in \mathrm{Bill}(V \times V)$ and if $g \in \mathrm{Bill}(V \times V)$ is the symmetric bilinear form
$g : (v, w) \mapsto \frac{1}{2}[f(v, w) + f(w, v)]$ then the quadratic forms defined by f and g are the same.
Therefore, without loss of generality, we will always assume that all quadratic forms
over such fields are defined by symmetric bilinear forms. We further see that different
symmetric bilinear forms define different quadratic forms, since, for any
$v, w \in V$, we have $f(v, w) = \frac{1}{2}[q(v + w) - q(v) - q(w)]$. The classification of
quadratic forms is of great importance in analytic geometry and in number theory.
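
Both observations in this paragraph, that a form and its symmetrization define the same quadratic form and that the symmetric form is recovered from q by polarization, can be verified on a small case. The sketch below uses a hypothetical non-symmetric matrix `N` over the rationals; all names are illustrative.

```python
from fractions import Fraction

def bilinear(M):
    """The bilinear form (v, w) -> v^T M w defined by the matrix M."""
    return lambda v, w: sum(v[i] * M[i][j] * w[j]
                            for i in range(len(v)) for j in range(len(w)))

def add(v, w):
    return [a + b for a, b in zip(v, w)]

N = [[1, 2], [0, 3]]          # hypothetical non-symmetric matrix
f = bilinear(N)

def g(v, w):
    """Symmetrization of f: (v, w) -> (f(v, w) + f(w, v)) / 2."""
    return Fraction(f(v, w) + f(w, v), 2)

def q(v):
    """Quadratic form defined by f."""
    return f(v, v)

def polarized(v, w):
    """Recover the symmetric form from q: (q(v + w) - q(v) - q(w)) / 2."""
    return Fraction(q(add(v, w)) - q(v) - q(w), 2)

vectors = [[1, 0], [0, 1], [2, -3], [5, 7]]
same_quadratic_form = all(g(v, v) == q(v) for v in vectors)
polarization_holds = all(polarized(v, w) == g(v, w)
                         for v in vectors for w in vectors)
```

Division by 2 is where the assumption on the characteristic enters: over a field of characteristic 2 neither `g` nor `polarized` could be formed.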


With kind permission of the Archives of the Mathematisches 
Forschungsinstitut Oberwolfach (Witt). 

The theory of quadratic forms over R was developed
by Gauss and his student Eisenstein, and the need 
to study such forms was one of the factors which 
led to the development of determinant theory. Their 
work was extended to quadratic forms over C by 
the nineteenth-century British mathematician Henry 
Smith. The fundamental development in the theory of symmetric bilinear forms on vec- 
tor spaces over fields of characteristic other than 2 is due to the twentieth-century German 
mathematician Ernst Witt. 


Let V be a vector space over R. A quadratic form $q : V \to \mathbb{R}$ is positive if and
only if $q(v) > 0$ for all $0_V \neq v \in V$. If $q : V \to \mathbb{R}$ is a positive quadratic form
defined by a symmetric bilinear form $f \in \mathrm{Bill}(V \times V)$, then f must be nondegenerate.
Indeed, if $0_V \neq v \in V$ then $f(v, v) \neq 0$.


Example If V is the vector space of all polynomial functions from R to itself, 
then we have a symmetric bilinear form from V x V to R defined by (f, g) & 
i Jf @®g(t) dt, which in turn defines the positive quadratic form f t> . f(t)? dt. 


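Example Let $V = \mathbb{R}^4$ and let $f \in \mathrm{Bill}(V \times V)$ be the symmetric bilinear form

$$f : \left(\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix}\right) \mapsto a_1 b_1 + a_2 b_2 + a_3 b_3 - a_4 b_4,$$

which lies at the center of Minkowski’s mathematical formulation of Einstein’s relativity
theory. The quadratic form defined by this bilinear form is

$$\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix} \mapsto a_1^2 + a_2^2 + a_3^2 - a_4^2.$$

A similar symmetric bilinear form is the Lorentz form

$$\left(\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix}, \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix}\right) \mapsto a_1 b_1 + a_2 b_2 + a_3 b_3 - c^2 a_4 b_4,$$

where c is the speed of light. The quadratic form defined by this bilinear form is

$$\begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{bmatrix} \mapsto a_1^2 + a_2^2 + a_3^2 - c^2 a_4^2.$$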
With kind permission of the Museum Boerhaave Leiden.


The Dutch physicist Hendrik Antoon Lorentz, the first to conceive
of the notion of the electron, won a Nobel prize in 1902. His work 
formed a basis for much of Einstein’s theory. 


A more general result, also based on the work of Lorentz and Minkowski, gives
a fascinating “reversal” of the Cauchy–Schwarz–Bunyakovski inequality. Let n be a
positive integer and consider the subset (not subspace) U of $\mathbb{R}^{n+1}$ consisting of all
vectors of the form $\begin{bmatrix} a \\ v \end{bmatrix}$, where a is a nonnegative real number and $v \in \mathbb{R}^n$ satisfies
$\|v\| \le a$. For $u = \begin{bmatrix} a \\ v \end{bmatrix}$ and $y = \begin{bmatrix} b \\ w \end{bmatrix}$ in U, let us define $u \boxdot y$ to be $ab - v \cdot w$. By
our assumption on U, we note that $u \boxdot u \ge 0$ for every $u \in U$. Then one can show
that $u \boxdot y \ge \sqrt{u \boxdot u}\,\sqrt{y \boxdot y}$. This inequality is often known as the lightcone
inequality because of its applications in physics.
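
A quick numerical experiment makes the lightcone inequality plausible. The sketch below samples pairs of vectors from U for n = 3 and looks for violations; the sampling scheme, the seed, and the floating-point tolerance `1e-9` are arbitrary choices made for illustration, and a search of this kind is evidence rather than a proof.

```python
import math
import random

def box(u, y):
    """For u = (a, v) and y = (b, w), return ab - v . w."""
    (a, v), (b, w) = u, y
    return a * b - sum(p * q for p, q in zip(v, w))

def random_cone_vector(rng, n):
    """Pick v in R^n, then a >= ||v||, so that (a, v) lies in U."""
    v = [rng.uniform(-1, 1) for _ in range(n)]
    a = math.sqrt(sum(x * x for x in v)) + rng.uniform(0, 1)
    return (a, v)

rng = random.Random(0)
samples = [(random_cone_vector(rng, 3), random_cone_vector(rng, 3))
           for _ in range(1000)]

# max(..., 0.0) guards against tiny negative values caused by rounding.
violations = [
    (u, y) for u, y in samples
    if box(u, y) < (math.sqrt(max(box(u, u), 0.0))
                    * math.sqrt(max(box(y, y), 0.0)) - 1e-9)
]
```

No sampled pair violates the inequality, so `violations` comes out empty.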


Example If V is an inner product space over R, then we have already noted that
the function $f : (v, w) \mapsto \langle v, w \rangle$ is a symmetric bilinear form. The quadratic form
defined by f is given by $v \mapsto \|v\|^2$. This quadratic form is surely positive. The
converse is also true. If $f \in \mathrm{Bill}(V \times V)$ is a symmetric bilinear form defining a
positive quadratic form, then f is an inner product on V, in the sense of Chap. 15.


By Proposition 20.7, we see that if V is a vector space finitely generated over C
and if $f \in \mathrm{Bill}(V \times V)$ is symmetric and has rank r, we can find a basis $\{v_1, \ldots, v_n\}$
of V such that the quadratic form q defined by f is given by
$q : \sum_{i=1}^{n} a_i v_i \mapsto \sum_{i=1}^{r} a_i^2$.


Example Let F be either R or C. Let n be a positive integer and let $A \in M_{n \times n}(F)$
be symmetric. Let $f \in \mathrm{Bill}(F^n \times F^n)$ be the symmetric bilinear form given by
$f : (v, w) \mapsto v^T A w$, and let q be the quadratic form defined by f. The set
$\{q(v) \mid \|v\| = 1\}$ (here the norm is the one defined by the dot product on $F^n$) is called the
numerical range of the matrix A. In the case F = C, this is always a bounded convex 
subset which contains all of the eigenvalues of A. For the special case n = 2, this 
set is an ellipse with its foci at the eigenvalues of A, assuming that they are distinct, 
or a circle with center at the sole eigenvalue of A, assuming that A has only one 
eigenvalue of multiplicity 2. For n > 2, the characterization of the numerical range 
is much more complicated. 
Proposition 20.8 Let n be a positive integer and let $A \in M_{n \times n}(\mathbb{R})$ be symmetric.
Let $f \in \mathrm{Bill}(\mathbb{R}^n \times \mathbb{R}^n)$ be the symmetric bilinear form given by
$f : (v, w) \mapsto v^T A w$, and let q be the quadratic form defined by f. Let
$c_1 \ge c_2 \ge \cdots \ge c_n$ be the eigenvalues of A. Then the numerical range of A
lies in the closed interval $[c_n, c_1]$. Moreover, both endpoints of this interval
belong to the numerical range of A.


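Proof By Proposition 17.7, we know that there exists an orthonormal basis
$B = \{v_1, \ldots, v_n\}$ of $V = \mathbb{R}^n$ consisting of eigenvectors of A, where $v_i$ corresponds to
$c_i$ for each i. Moreover, if $v \in V$ satisfies $\|v\| = 1$ then $v = \sum_{i=1}^{n} \langle v, v_i \rangle v_i$ by
Proposition 17.9, and so $1 = \|v\|^2 = \langle v, v \rangle = \sum_{i=1}^{n} \langle v, v_i \rangle^2$. We
also see that $Av = \sum_{i=1}^{n} \langle v, v_i \rangle A(v_i) = \sum_{i=1}^{n} c_i \langle v, v_i \rangle v_i$. Thus
$v^T A v = \langle v, Av \rangle = \sum_{i=1}^{n} c_i \langle v, v_i \rangle^2$. But
$c_1 = c_1 \sum_{i=1}^{n} \langle v, v_i \rangle^2 \ge \sum_{i=1}^{n} c_i \langle v, v_i \rangle^2 \ge c_n \sum_{i=1}^{n} \langle v, v_i \rangle^2 = c_n$.
Therefore, the numerical range of A lies in the closed interval $[c_n, c_1]$.

If v is a normal eigenvector of A corresponding to $c_n$, then $v^T A v = \langle v, Av \rangle =
\langle v, c_n v \rangle = c_n \langle v, v \rangle = c_n$, and similarly for the case of an eigenvector of A satisfying
$\|v\| = 1$ and corresponding to $c_1$.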
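
For a real symmetric $2 \times 2$ matrix the statement of Proposition 20.8 is easy to observe directly, since every unit vector has the form $(\cos t, \sin t)$. The following sketch uses the hypothetical matrix with rows (2, 1) and (1, 2), whose eigenvalues are 3 and 1, and samples the quadratic form at one-degree steps.

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues c1 >= c2 of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean + radius, mean - radius

def q(a, b, d, v):
    """Quadratic form v^T A v for A = [[a, b], [b, d]]."""
    x, y = v
    return a * x * x + 2 * b * x * y + d * y * y

a, b, d = 2.0, 1.0, 2.0        # hypothetical symmetric matrix
c1, c2 = eigenvalues_sym2(a, b, d)

# Values of q on unit vectors (cos t, sin t), t = 0, 1, ..., 359 degrees.
values = [q(a, b, d, (math.cos(math.radians(t)), math.sin(math.radians(t))))
          for t in range(360)]
lowest, highest = min(values), max(values)
```

Up to rounding, `lowest` and `highest` coincide with `c2` and `c1`; they are attained at 135 and 45 degrees, the directions of the unit eigenvectors.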


In order to see the geometric significance of quadratic forms, let us recall that a
general quadratic equation in three unknowns over R is one of the form

$$(a_{11}X_1^2 + a_{22}X_2^2 + a_{33}X_3^2) + 2(a_{12}X_1X_2 + a_{13}X_1X_3 + a_{23}X_2X_3) + b_1X_1 + b_2X_2 + b_3X_3 + c = 0$$

in which not all of the $a_{ij}$ are equal to 0. Such an equation can be written
in the form $f(v, v) + w \cdot v + c = 0$, where $f \in \mathrm{Bill}(\mathbb{R}^3 \times \mathbb{R}^3)$ is the symmetric
bilinear form defined with respect to the canonical basis by the matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{bmatrix}, \quad \text{where } w = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \text{ and where } v = \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix}.$$

The graph of such an equation is a quadratic surface. The various quadratic surfaces in
$\mathbb{R}^3$ can then be classified by considering congruence classes of the matrices A, a task very
important in analytic geometry.

We will now return to the general case of bilinear transformations. Let F be a
field, let V and W be vector spaces over F, and let $G = F^{(V \times W)}$. Then G is a
subspace of $F^{V \times W}$ having a basis $\{g_{v,w} \mid (v, w) \in V \times W\}$, where

$$g_{v,w} : (v', w') \mapsto \begin{cases} 1 & \text{if } (v', w') = (v, w), \\ 0 & \text{otherwise.} \end{cases}$$

Let H be the subspace of G generated by all functions of the form

$$g_{v_1+v_2,w} - g_{v_1,w} - g_{v_2,w}, \quad g_{v,w_1+w_2} - g_{v,w_1} - g_{v,w_2}, \quad g_{av,w} - a g_{v,w}, \quad \text{or} \quad g_{v,aw} - a g_{v,w}$$

for all $v, v_1, v_2 \in V$, $w, w_1, w_2 \in W$, and $a \in F$. Let us pick a complement of H
in G, and call it $V \otimes W$. By Proposition 7.8, we know that $V \otimes W$ is unique up
to isomorphism. Let $\alpha$ be the projection of G with image $V \otimes W$ coming from the
decomposition $G = H \oplus (V \otimes W)$ and, for all $v \in V$ and $w \in W$, denote $\alpha(g_{v,w})$
by $v \otimes w$. Then $B = \{v \otimes w \mid (v, w) \in V \times W\}$ is a generating set for $V \otimes W$. It
is important to emphasize that the elements of $V \otimes W$ are linear combinations of
elements of B. In quantum physics, elements of $(V \otimes W) \setminus B$, for suitable spaces V
and W, are known as entangled tensors and these have important physical interpretations.
Elements of B are known as simple tensors.
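If $v_1, v_2 \in V$ and $w \in W$, then

$$[v_1 + v_2] \otimes w - (v_1 \otimes w) - (v_2 \otimes w) = \alpha(g_{v_1+v_2,w} - g_{v_1,w} - g_{v_2,w}) = 0_G$$

and so $[v_1 + v_2] \otimes w = (v_1 \otimes w) + (v_2 \otimes w)$. Similarly, if $v \in V$ and $w_1, w_2 \in W$
then $v \otimes [w_1 + w_2] = v \otimes w_1 + v \otimes w_2$. We also see that if $v \in V$, $w \in W$ and $c \in F$,
then $cv \otimes w = c(v \otimes w) = v \otimes cw$. The vector space $V \otimes W$ is called the tensor
product of V and W.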


With kind permission of the Archives of the Mathematisches Forschungsinstitut
Oberwolfach (Chevalley).

There are many equivalent definitions of the tensor product. The definition
given here is due to the twentieth-century French mathematician Claude Chevalley.
The notion of a tensor was first introduced in differential calculus by the
nineteenth-century Italian mathematicians Gregorio Ricci-Curbastro and Tullio
Levi-Civita and became a central tool in relativity theory.


From the definition of the tensor product, we see that the function $t_{VW}$ from
$V \times W$ to $V \otimes W$ given by $(v, w) \mapsto v \otimes w$ is a bilinear transformation. This
transformation has a very special significance, due to the following theorem, which
allows us to move from bilinear transformations to linear transformations.


Proposition 20.9 Let V, W, and Y be vector spaces over a field F. For each
bilinear transformation $f \in \mathrm{Bill}(V \times W, Y)$ there exists a unique linear
transformation $\alpha \in \mathrm{Hom}(V \otimes W, Y)$ satisfying $f = \alpha t_{VW}$.


Proof Given $f \in \mathrm{Bill}(V \times W, Y)$, there exists a linear transformation $\beta \in \mathrm{Hom}(G, Y)$
defined on the elements of a basis of the space G defined above, given by the
condition that $\beta : g_{v,w} \mapsto f(v, w)$. Since f is a bilinear transformation, $H \subseteq \ker(\beta)$
and so we can define the linear transformation $\alpha \in \mathrm{Hom}(V \otimes W, Y)$ by setting
$\alpha : \sum_{i=1}^{n} a_i [v_i \otimes w_i] \mapsto \sum_{i=1}^{n} a_i f(v_i, w_i)$. This function is well-defined since if
$\sum_{i=1}^{n} a_i [v_i \otimes w_i] = \sum_{i=1}^{n} b_i [v_i \otimes w_i]$ in $V \otimes W$ then
$\sum_{i=1}^{n} (a_i - b_i) g_{v_i,w_i} \in H \subseteq \ker(\beta)$. Therefore, we see that
$\alpha\left(\sum_{i=1}^{n} a_i [v_i \otimes w_i]\right) = \alpha\left(\sum_{i=1}^{n} b_i [v_i \otimes w_i]\right)$. Clearly,
$\alpha$ is a linear transformation and satisfies $f = \alpha t_{VW}$.

We are left to prove uniqueness. Suppose that $\gamma \in \mathrm{Hom}(V \otimes W, Y)$ satisfies
$f = \gamma t_{VW}$. In particular, $\alpha(v \otimes w) = \gamma(v \otimes w)$ for all $(v, w) \in V \times W$. That is to
say, $\alpha$ and $\gamma$ act identically on a generating set for $V \otimes W$ and so, in particular, on
a basis for $V \otimes W$ contained in this generating set. Therefore, by Proposition 6.2, it
follows that $\alpha = \gamma$.


The following proposition is very important, and is often used as a basis for the 
definition of the tensor product. 


Proposition 20.10 If V, W, and Y are vector spaces over a field F, then the
vector spaces $\mathrm{Hom}(V \otimes W, Y)$ and $\mathrm{Hom}(V, \mathrm{Hom}(W, Y))$ are isomorphic.


Proof The function $\mathrm{Hom}(V \otimes W, Y) \to \mathrm{Bill}(V \times W, Y)$ defined by $\beta \mapsto \beta t_{VW}$ is
clearly a linear transformation, and from Proposition 20.9 it follows that this is an
isomorphism. Therefore, the result follows from Proposition 20.1.


Example Let V and W be vector spaces over a field F and let $\delta_1 \in D(V)$ and
$\delta_2 \in D(W)$ be linear functionals. Then there exists a bilinear form in $\mathrm{Bill}(V \times W)$
defined by $(v, w) \mapsto \delta_1(v)\delta_2(w)$. From Proposition 20.9, it follows that there exists
a linear functional $\delta_1 \otimes \delta_2 \in D(V \otimes W)$ satisfying
$\delta_1 \otimes \delta_2 : \sum_{i=1}^{n} a_i [v_i \otimes w_i] \mapsto \sum_{i=1}^{n} a_i \delta_1(v_i) \delta_2(w_i)$.


Example More generally, let V and W be vector spaces over a field F, let $\alpha$ be
an endomorphism of V, and let $\beta$ be an endomorphism of W. The function from
$V \times W$ to $V \otimes W$ defined by

$$(v, w) \mapsto \alpha(v) \otimes \beta(w)$$

is a bilinear transformation and so defines an endomorphism $\alpha \otimes \beta$ of $V \otimes W$ satisfying
$\alpha \otimes \beta : \sum_{i=1}^{n} a_i [v_i \otimes w_i] \mapsto \sum_{i=1}^{n} a_i [\alpha(v_i) \otimes \beta(w_i)]$.


By Proposition 5.13, we know that if $V \oplus W$ is a vector space finitely-generated
over a field F, then $\dim(V \oplus W) = \dim(V) + \dim(W)$. We now prove the
“multiplicative” analog of this assertion for tensor products.


Proposition 20.11 Let V and W be vector spaces finitely generated over
a field F. Then $V \otimes W$ is also finitely generated, and
$\dim(V \otimes W) = \dim(V) \dim(W)$.

Proof Let us choose bases $\{v_1, \ldots, v_k\}$ of V and $\{w_1, \ldots, w_n\}$ of W. Then
for any $v = \sum_{i=1}^{k} a_i v_i \in V$ and $w = \sum_{j=1}^{n} b_j w_j \in W$, we see that
$v \otimes w = \sum_{i=1}^{k} \sum_{j=1}^{n} a_i b_j (v_i \otimes w_j)$. Thus we see that
$\{v_i \otimes w_j \mid 1 \le i \le k \text{ and } 1 \le j \le n\}$ is
a generating set for $V \otimes W$, showing that $V \otimes W$ is finitely-generated. Moreover, by
Proposition 20.10 and Proposition 14.8, we see that the dimension of $V \otimes W$ is equal
to the dimension of $D(V \otimes W)$ and hence to the dimension of $\mathrm{Hom}(V, D(W))$, and
this is equal to the dimension of $\mathrm{Hom}(V, W)$, which is precisely $\dim(V) \dim(W)$.


In particular, we see that in the context of Proposition 20.11, the set
$\{v_i \otimes w_j \mid 1 \le i \le k \text{ and } 1 \le j \le n\}$ is in fact a basis of $V \otimes W$.


Example Let F be a field and let k and n be positive integers. Then, by
Proposition 20.11, we know that $\dim(F^k \otimes F^n) = kn = \dim(M_{k \times n}(F))$, and so the
vector spaces $F^k \otimes F^n$ and $M_{k \times n}(F)$ are isomorphic. Indeed, if we choose bases
$\{v_1, \ldots, v_k\}$ of $F^k$ and $\{w_1, \ldots, w_n\}$ of $F^n$, then the function $v_i \otimes w_j \mapsto v_i \wedge w_j$
extends to an isomorphism between these two spaces.
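
Under this isomorphism the distinction between simple and entangled tensors becomes a statement about rank. The sketch below assumes that $v \wedge w$ denotes the matrix $vw^T$, which is an interpretation of the notation rather than something stated in this passage, and it works in $F^2 \otimes F^2$, where a matrix is the image of a simple tensor exactly when its determinant is 0.

```python
def outer(v, w):
    """The k x n matrix v w^T, taken here as the image of v (x) w."""
    return [[a * b for b in w] for a in v]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

e1, e2 = [1, 0], [0, 1]

# Image of the simple tensor [1, 2] (x) [3, 4]: a matrix of rank 1.
simple = outer([1, 2], [3, 4])

# Image of e1 (x) e1 + e2 (x) e2: the identity matrix, of rank 2.
entangled = add(outer(e1, e1), outer(e2, e2))

simple_is_rank_one = det2(simple) == 0
entangled_is_simple = det2(entangled) == 0
```

Since the identity matrix has nonzero determinant, $e_1 \otimes e_1 + e_2 \otimes e_2$ cannot be written as a single $v \otimes w$, which makes it an entangled tensor in the terminology introduced earlier.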


Example Let F be a field, let n be a positive integer, and let V be a vector
space finitely generated over F and having a basis $\{v_1, \ldots, v_k\}$. The dimension
of the vector space $M_{n \times n}(V)$ over F is $n^2 k$. Consider the bilinear transformation
$f : M_{n \times n}(F) \times V \to M_{n \times n}(V)$ defined by $([a_{ij}], v) \mapsto [a_{ij} v]$. By
Proposition 20.9, we know that this bilinear transformation defines a linear transformation
$\alpha : M_{n \times n}(F) \otimes V \to M_{n \times n}(V)$ and it is clear that this is an epimorphism. But, by
Proposition 20.11, we see that the dimension of $M_{n \times n}(F) \otimes V$ is also equal to $n^2 k$
and so $\alpha$ must be an isomorphism.


Example Let F be a field and let k, n, s, and t be positive integers. Let
$f : M_{k \times n}(F) \times M_{s \times t}(F) \to M_{ks \times nt}(F)$ be the function defined by

$$f : (A, B) \mapsto \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & & \vdots \\ a_{k1}B & \cdots & a_{kn}B \end{bmatrix}.$$

This is a bilinear transformation of vector spaces over F and so, by Proposition 20.9,
it defines a linear transformation $\alpha : M_{k \times n}(F) \otimes M_{s \times t}(F) \to M_{ks \times nt}(F)$ which,
again, can be shown to be an isomorphism. In the literature, it is usual to write $A \otimes B$
instead of f(A, B). This matrix is called the Kronecker product of the matrices A
and B. Kronecker products are very important in matrix theory and its applications.
It is easy to see that for all such matrices A and B we have $(A \otimes B)^T = A^T \otimes B^T$.
Moreover, if k = n and s = t and if A and B are nonsingular, then $A \otimes B$ is nonsingular,
and $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$. We also note that if A and B are symmetric
then so is $A \otimes B$. Furthermore, Cholesky or QR-factorizations of $A \otimes B$ come
immediately from the corresponding factorizations of A and B.
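
The transpose and inverse rules for Kronecker products are easy to confirm on small matrices. The following is a minimal sketch using exact rational arithmetic; the matrices `A` and `B` are hypothetical, and `inverse2` handles only the $2 \times 2$ case.

```python
from fractions import Fraction

def kron(A, B):
    """Kronecker product: the block matrix whose (i, j) block is a_ij * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][r] * Y[r][j] for r in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse2(M):
    """Inverse of a nonsingular 2 x 2 matrix, computed exactly."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 5]]      # determinant -1
B = [[2, 1], [1, 1]]      # determinant 1
K = kron(A, B)

transpose_rule = transpose(K) == kron(transpose(A), transpose(B))
inverse_rule = matmul(K, kron(inverse2(A), inverse2(B))) == identity(4)
```

The inverse rule is an instance of the more general identity $(A \otimes B)(C \otimes D) = AC \otimes BD$, which holds whenever the products AC and BD are defined.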
As an example of the use of Kronecker products, we note the following result,
established in the 1970s by the American mathematicians Michael Gauger
and Christopher Byrnes: Let F be a field, let n be a positive integer, and let
$A, B \in M_{n \times n}(F)$. Let I be the multiplicative identity of $M_{n \times n}(F)$. Then the
matrices A and B are similar if and only if they have the same characteristic polynomial
and the $n^2 \times n^2$ matrices $A \otimes I - I \otimes A$, $B \otimes I - I \otimes B$, and $A \otimes I - I \otimes B$ all
have the same rank.

Because of the utility of Kronecker products, one can raise the following problem:
Given positive integers k, n, s, and t, and given $C \in M_{ks \times nt}(\mathbb{R})$, find matrices
$A \in M_{k \times n}(\mathbb{R})$ and $B \in M_{s \times t}(\mathbb{R})$ such that $\|C - A \otimes B\|$ is minimal. Several
algorithms have been developed for finding a solution to this problem, the
first by the American computer scientists Charles Van Loan and Nikos Pitsianis.

Let V, V’, W, and W’ be vector spaces over a field F. If $\alpha \in \mathrm{Hom}(V, V')$ and
$\beta \in \mathrm{Hom}(W, W')$ then we have a bilinear transformation $V \times W \to V' \otimes W'$ defined
by $(v, w) \mapsto \alpha(v) \otimes \beta(w)$ and so, by Proposition 20.9, there exists a linear
transformation from $V \otimes W$ to $V' \otimes W'$ satisfying $v \otimes w \mapsto \alpha(v) \otimes \beta(w)$. We will
denote this linear transformation by $\alpha \otimes \beta$.


Proposition 20.12 Let V, V’, W, and W’ be vector spaces finitely generated
over a field F. Any element of the space $\mathrm{Hom}(V \otimes W, V' \otimes W')$ is of the
form $\sum_{i=1}^{n} \alpha_i \otimes \beta_i$, where $\alpha_i \in \mathrm{Hom}(V, V')$ and $\beta_i \in \mathrm{Hom}(W, W')$ for each
$1 \le i \le n$.


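Proof The function $(\alpha, \beta) \mapsto \alpha \otimes \beta$ from $\mathrm{Hom}(V, V') \times \mathrm{Hom}(W, W')$ to
$\mathrm{Hom}(V \otimes W, V' \otimes W')$ is bilinear and so defines a linear transformation
$\varphi : \mathrm{Hom}(V, V') \otimes \mathrm{Hom}(W, W') \to \mathrm{Hom}(V \otimes W, V' \otimes W')$. We are done if we can show
that $\varphi$ is an isomorphism. By Propositions 8.1 and 20.11, we know that

$$\begin{aligned}
\dim(\mathrm{Hom}(V, V') \otimes \mathrm{Hom}(W, W')) &= \dim(\mathrm{Hom}(V, V')) \dim(\mathrm{Hom}(W, W')) \\
&= \dim(V) \dim(V') \dim(W) \dim(W') \\
&= \dim(V \otimes W) \dim(V' \otimes W') \\
&= \dim(\mathrm{Hom}(V \otimes W, V' \otimes W')),
\end{aligned}$$

and so it suffices to prove that $\varphi$ is a monomorphism.

Indeed, assume that $\sum_{i=1}^{n} \alpha_i \otimes \beta_i \in \ker(\varphi)$, where the set $\{\beta_1, \ldots, \beta_n\}$ is
linearly independent, and where none of the $\alpha_i$ is the 0-function. Then
$\sum_{i=1}^{n} \alpha_i(v) \otimes \beta_i(w) = 0_{V' \otimes W'}$ for all $v \in V$ and all $w \in W$. Pick $v \in V$ satisfying
$\alpha_1(v) \neq 0_{V'}$. By renumbering if necessary, we can assume that $\{\alpha_1(v), \ldots, \alpha_k(v)\}$ is a
maximal linearly-independent subset of $\{\alpha_1(v), \ldots, \alpha_n(v)\}$. Therefore, for each
$k < h \le n$ there exist scalars $b_{hj}$, not all of them being equal to 0, such that
$\alpha_h(v) = \sum_{j=1}^{k} b_{hj} \alpha_j(v)$ and so

$$\begin{aligned}
0_{V' \otimes W'} &= \sum_{i=1}^{k} \alpha_i(v) \otimes \beta_i(w) + \sum_{h=k+1}^{n} \left(\sum_{j=1}^{k} b_{hj} \alpha_j(v)\right) \otimes \beta_h(w) \\
&= \sum_{i=1}^{k} \alpha_i(v) \otimes \beta_i(w) + \sum_{j=1}^{k} \alpha_j(v) \otimes \left(\sum_{h=k+1}^{n} b_{hj} \beta_h(w)\right) \\
&= \sum_{i=1}^{k} \alpha_i(v) \otimes \left(\beta_i(w) + \sum_{h=k+1}^{n} b_{hi} \beta_h(w)\right).
\end{aligned}$$

Since the set $\{\alpha_1(v), \ldots, \alpha_k(v)\}$ is linearly independent, we must have
$\beta_i(w) + \sum_{h=k+1}^{n} b_{hi} \beta_h(w) = 0_{W'}$ for all $1 \le i \le k$ and all $w \in W$. Hence
$\beta_i + \sum_{h=k+1}^{n} b_{hi} \beta_h$ is the 0-function for all $1 \le i \le k$, contradicting the
assumption that the set $\{\beta_1, \ldots, \beta_n\}$ is linearly independent. We therefore conclude
that $\ker(\varphi)$ is trivial, which is what we needed to prove.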


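Proposition 20.13 If U, V, and W are vector spaces over a field F, then

$$U \otimes (V \otimes W) \cong (U \otimes V) \otimes W.$$


Proof The bilinear transformation $U \times (V \otimes W) \to (U \otimes V) \otimes W$ defined by
$(u, v \otimes w) \mapsto (u \otimes v) \otimes w$ induces a linear transformation
$\alpha : U \otimes (V \otimes W) \to (U \otimes V) \otimes W$ which satisfies $u \otimes (v \otimes w) \mapsto (u \otimes v) \otimes w$.
Similarly, we have a linear transformation $\beta : (U \otimes V) \otimes W \to U \otimes (V \otimes W)$ which
satisfies $(u \otimes v) \otimes w \mapsto u \otimes (v \otimes w)$.
Since $\alpha\beta$ and $\beta\alpha$ are clearly the respective identity maps, we see that $\alpha$ must be the
isomorphism we seek.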


Proposition 20.14 If V and W are vector spaces over a field F, then
$V \otimes W \cong W \otimes V$.


Proof The bilinear transformation $V \times W \to W \otimes V$ defined by $(v, w) \mapsto w \otimes v$
induces a linear transformation $\alpha$ from $V \otimes W$ to $W \otimes V$ satisfying
$\alpha : v \otimes w \mapsto w \otimes v$. Similarly, there exists a linear transformation
$\beta : W \otimes V \to V \otimes W$ satisfying $\beta : w \otimes v \mapsto v \otimes w$. Since $\alpha\beta$ and $\beta\alpha$ are clearly
the respective identity maps, we see that $\alpha$ must be the isomorphism we seek.


Finally, let us briefly mention two algebras built on the notion of the tensor prod- 
uct. The study of these algebras is beyond the scope of this book. However, the 
reader should be aware of them and will find it fruitful to explore them further. In 
what ensues, V is an arbitrary vector space over a field F. 
(I) For each nonnegative integer k, we define the vector space $V^{\otimes k}$ over F by
setting $V^{\otimes 0} = F$ and $V^{\otimes k} = V^{\otimes (k-1)} \otimes V$ if k > 0. Let $T(V) = \bigoplus_{k=0}^{\infty} V^{\otimes k}$. We
can define a product $\bullet$ on T(V) by setting
$(v_1 \otimes \cdots \otimes v_k) \bullet (v_{k+1} \otimes \cdots \otimes v_{k+m}) = v_1 \otimes \cdots \otimes v_{k+m}$
for all $v_1, \ldots, v_{k+m} \in V$ and extending linearly. This is an F-algebra,
known as the tensor algebra of V over F. The tensor algebra has several important
properties, one of which is that if K is any algebra over F then any linear
transformation $\alpha : V \to K$ can be uniquely extended to a homomorphism of
F-algebras from T(V) to K. Moreover, if W is a vector space over F then any linear
transformation $\alpha : V \to W$ can be uniquely extended to a homomorphism of
F-algebras from T(V) to T(W). (In the language of category theory, this says that
$T(\cdot)$ is a functor from the category of vector spaces over F to the category of
F-algebras.)

(II) Let Y be the subspace of $V \otimes V$ generated by $\{v \otimes v \mid v \in V\}$. Then a
complement of Y in $V \otimes V$ is called an exterior square of V and is denoted
by $V \wedge V$. This space is unique up to isomorphism. If $\alpha$ is the projection of
$V \otimes V$ with image $V \wedge V$ and kernel Y, denote $\alpha(v \otimes w)$ by $v \wedge w$. Since
$(v + w) \otimes (v + w) = v \otimes v + v \otimes w + w \otimes v + w \otimes w$ for all $v, w \in V$, we see that
$v \wedge w = -w \wedge v$ for all $v, w \in V$. Therefore, if V is finitely-generated over F with
basis $\{v_1, \ldots, v_n\}$, we see that $\{v_i \wedge v_j \mid 1 \le i < j \le n\}$ is a basis for $V \wedge V$, and
hence $\dim(V \wedge V) = \binom{n}{2} = \frac{1}{2}n(n - 1)$. This construction can be iterated to more
than two factors. If k > 0 is an integer, we can consider the subspace Y of $V^{\otimes k}$
generated by all expressions of the form $v_1 \otimes \cdots \otimes v_k$ in which $v_i = v_j$ for some $i \neq j$.
A complement of Y is denoted by $\bigwedge^k V$ and is called the kth exterior power of V.
If V has finite dimension n, then $\dim(\bigwedge^k V) = \binom{n}{k}$. In particular, we note that $\bigwedge^k V$
is trivial when k > n. The subspace $\bigwedge(V) = \bigoplus_{k=0}^{\infty} \bigwedge^k V$ of T(V) is known as the
exterior algebra of V, and has important applications in geometry and cohomology
theory. One can show that if $(K, \bullet)$ is a unital F-algebra and if $\alpha : V \to K$ is a linear
transformation satisfying the condition that $\alpha(v) \bullet \alpha(v) = 0_K$ for all $v \in V$, then
$\alpha$ can be uniquely extended to a homomorphism of unital F-algebras from $\bigwedge(V)$
to K.


Exercises 


Exercise 1176 

Let n be a positive integer and let V be the space of all polynomials in C[X]
of degree at most n. For $f(X) = \sum_{i=0}^{n} a_i X^i$ and $g(X) = \sum_{i=0}^{n} b_i X^i$ in V, we
define the nth Bézout matrix $\mathrm{Bez}_n(f, g) \in M_{n \times n}(\mathbb{C})$ defined by f and g as follows:
$\mathrm{Bez}_n(f, g) = [c_{ij}]$, where $c_{ij} = \sum_{k=1}^{m(i,j)} [a_{j+k-1} b_{i-k} - a_{i-k} b_{j+k-1}]$ and
$m(i, j) = \min\{i, n + 1 - j\}$. Show that the function $\mathrm{Bez}_n : V \times V \to M_{n \times n}(\mathbb{C})$
is a bilinear transformation satisfying the condition that $\mathrm{Bez}_n(f, f) = O$ for all
$f \in V$. If $n = \max\{\deg(f), \deg(g)\}$, show that $\mathrm{Bez}_n(f, g)$ is nonsingular if and
only if f and g have no common roots.
Etienne Bézout was an eighteenth-century French mathematician.


Exercise 1177 


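Find $a \in \mathbb{R}$ such that the matrices $\begin{bmatrix} 1 & a & a \\ 0 & 1 & a \\ 0 & 0 & 1 \end{bmatrix}$ and $\begin{bmatrix} 1 & -1 & -1 \\ -1 & 1 & -1 \\ -1 & -1 & 1 \end{bmatrix}$ define
the same bilinear form in $\mathrm{Bill}(\mathbb{R}^3 \times \mathbb{R}^3)$.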


Exercise 1178 
Let V = Q2. Is the function from V x V to Q given by (f,g)t> (f + )(5) ; 
(f — g)(2) a bilinear form? 


Exercise 1179 
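Let F be a field and let $u : \mathbb{N} \times \mathbb{N} \to F$ be an arbitrary function. Is the function
$t_u : F[X] \times F[X] \to F[X]$ defined by

$$t_u : \left(\sum_{i=0}^{\infty} a_i X^i, \sum_{j=0}^{\infty} b_j X^j\right) \mapsto \sum_{k=0}^{\infty} \left(\sum_{i+j=k} u(i, j) a_i b_j\right) X^k$$

a bilinear transformation?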


Exercise 1180 
Let B be the canonical basis for the vector space V = R?. Find a bilinear form 


f €Bill(V x V) satisfying the condition Tgg(f) = P = . 


Exercise 1181 
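Let B be the canonical basis for $\mathbb{R}^3$. Find $T_{BB}(f)$, where $f : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ is
the bilinear form defined by

$$f : \left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}, \begin{bmatrix} a' \\ b' \\ c' \end{bmatrix}\right) \mapsto aa' + 2bc' + cc' + 2cb' - ab' + bb' - ba'.$$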
Exercise 1182
Let f : R? x R? > R be the bilinear form defined by 


0 1 0 
f:Q@,wrvu-}| 1 0 2] w. 
01 1 


1 0 1 
Find the matrix representing f with respect to the basis O},) 1], 
0 1 1 


of R?. 


Exercise 1183 

Let $V$ and $W$ be vector spaces over a field $F$ and let $\alpha \in \mathrm{Hom}(V, W)$. For each
$g \in \mathrm{Bil}(W \times W)$, let us define the bilinear form $g_\alpha \in \mathrm{Bil}(V \times V)$ by setting
$g_\alpha : (v, v') \mapsto g(\alpha(v), \alpha(v'))$. Is the function $g \mapsto g_\alpha$ a linear transformation?


Exercise 1184 

Let $F$ be a field of characteristic other than 2 and let $V$ be a vector space over $F$.
Let $f \in \mathrm{Bil}(V \times V)$. Show that $f(v, v) \ne 0$ for all $0_V \ne v \in V$ if and only if
for every nontrivial subspace $W$ of $V$ and for every $0_V \ne w \in W$ there exists a
vector $w' \in W$ satisfying $f(w, w') \ne 0$.


Exercise 1185 
Show that if $V$ and $W$ are vector spaces finitely generated over a field $F$ of
unequal dimensions, then there is no nondegenerate $f \in \mathrm{Bil}(V \times W)$.


Exercise 1186 
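Let $F$ be a field of characteristic 0 and let the bilinear form $f \in \mathrm{Bil}(F^3 \times F^3)$ be
defined by

$$f : \left( \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \begin{bmatrix} a' \\ b' \\ c' \end{bmatrix} \right) \mapsto aa' + bb' - cc'.$$

Is there a nontrivial subspace $W$ of $F^3$ satisfying $f(w, w') = 0$ for all $w, w' \in W$?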


Exercise 1187 
Let $f \in \mathrm{Bil}(\mathbb{R}^4 \times \mathbb{R}^4)$ be defined by $f : (v, w) \mapsto v \cdot (Aw)$, where


0 
0 
= 1 
0 


= oo Oo 


1 0 
0 1 
0 0 
0 0 


Find a basis $\{v_1, v_2, v_3, v_4\}$ of $\mathbb{R}^4$ satisfying the condition that $f(v_i, v_i) = 0$ for
all $1 \le i \le 4$.


Exercise 1188

Let $f : M_{n \times n}(F) \times M_{n \times n}(F) \to F$ be the function defined by $f : (A, B) \mapsto \mathrm{tr}(AB)$,
where $F$ is a field and $n$ is a positive integer. Is $f$ a bilinear form? Is $f$
symmetric?


Exercise 1189 
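Let $n$ be a positive integer and let $f : M_{n \times n}(\mathbb{C}) \times M_{n \times n}(\mathbb{C}) \to \mathbb{C}$ be the function
defined by $f : (A, B) \mapsto n \cdot \mathrm{tr}(AB) - \mathrm{tr}(A)\,\mathrm{tr}(B)$. Show that $f$ is a symmetric
bilinear form.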
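
As a sanity check before writing the proof, the form $f(A, B) = n \cdot \mathrm{tr}(AB) - \mathrm{tr}(A)\,\mathrm{tr}(B)$ can be evaluated on a few small matrices. The sketch below is only an illustration under stated assumptions: the helper names `matmul`, `trace`, and `combine` are hypothetical, the matrices have integer entries (a special case of complex entries) so that the arithmetic is exact, and the sample matrices are arbitrary.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def f(a, b):
    """The form (A, B) -> n * tr(AB) - tr(A) * tr(B)."""
    n = len(a)
    return n * trace(matmul(a, b)) - trace(a) * trace(b)


def combine(c, a, b):
    """Return the matrix c * a + b."""
    return [[c * x + y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 0], [1, 1]]

# Symmetry: f(A, B) should equal f(B, A).
print(f(A, B) == f(B, A))
# Linearity in the first argument: f(3A + C, B) = 3 f(A, B) + f(C, B).
print(f(combine(3, A, C), B) == 3 * f(A, B) + f(C, B))
```

The same helpers, with `f` replaced by `lambda a, b: trace(matmul(a, b))`, give a quick check for the form of Exercise 1188.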


Exercise 1190 
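Let $V$ be a vector space over a field $F$ and let $f \in \mathrm{Bil}(V \times V)$ be a symmetric
bilinear form. Let $Y = F \times V$ and define an operation $\bullet$ on $Y$ by setting

$$\begin{bmatrix} a \\ v \end{bmatrix} \bullet \begin{bmatrix} b \\ w \end{bmatrix} = \begin{bmatrix} ab + f(v, w) \\ aw + bv \end{bmatrix}$$

for all $a, b \in F$ and all $v, w \in V$.

Show that $(Y, \bullet)$ is a Jordan algebra.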


Exercise 1191 
Let $V$ be a vector space over $\mathbb{Q}$. Is the function $V \times V \to V$ defined by
$(v, v') \mapsto v + v'$ a bilinear transformation?


Exercise 1192 


0 0 0 25 -—5 35 
Are the matrices | | 1 0 and 35 0 -—3 21 | in.M3,.3(R) congruent? 
111 0 -4 28 
Exercise 1193 
—2 
Find an upper-triangular matrix in. M3,.3(R) congruent to | —1 1 0 | or 
- 4 


show that there is no such matrix. 


Exercise 1194 

Let $F$ be a field and let $n$ be a positive integer. A matrix $A = [a_{ij}] \in M_{n \times n}(F)$ is
an upper Hessenberg matrix if and only if $a_{ij} = 0$ whenever $i - j \ge 2$. Is every
matrix in $M_{n \times n}(\mathbb{R})$ necessarily congruent to an upper Hessenberg matrix?


© Brigitte Bossert. 


Karl Hessenberg was a twentieth-century German engineer.


Exercise 1195
Let $F$ be a field. Show that every upper triangular matrix in $M_{3 \times 3}(F)$ is
congruent to a lower triangular matrix.


Exercise 1196 
Let $n$ be a positive integer and let $A$ be a nonsingular symmetric matrix in
$M_{n \times n}(\mathbb{C})$. Show that $A$ is congruent to $A^{-1}$.


Exercise 1197 


Find a matrix P € M3,3(IR) such that the matrix P P’ is diagonal. 


Wr. dv 
- Oe 
OW me UW 


Exercise 1198 


1111 
11411 . : : 

Let A= 1111I/€ Ma 4(R). Find a nonsingular matrix P € Ma4,4(R) 
11411 

such that $PAP^T$ is diagonal.


Exercise 1199 
Find a diagonal matrix in $M_{4 \times 4}(\mathbb{R})$ congruent to the matrix

$$\begin{bmatrix} 1 & 2 & 3 & 2 \\ 2 & 3 & 4 & 8 \\ 3 & 5 & 8 & 10 \\ 2 & 8 & 10 & -8 \end{bmatrix}.$$


Exercise 1200
Is the matrix $\begin{bmatrix} 1 & i & 1+i \\ i & 0 & 2-i \\ 1+i & 2-i & 10+2i \end{bmatrix} \in M_{3 \times 3}(\mathbb{C})$ congruent to $I$?


Exercise 1201 

Let $n$ be a positive integer and let $\alpha$ be a positive-definite endomorphism of $\mathbb{R}^n$
represented with respect to the canonical basis by the matrix $A$. If $A'$ is a matrix
congruent to $A$, does it too represent a positive-definite endomorphism of $\mathbb{R}^n$
with respect to the canonical basis?


Exercise 1202 

Let $V$ be a vector space finitely generated over a field $F$ of characteristic other
than 2. If $f \in \mathrm{Bil}(V \times V)$ is symmetric and not the 0-function, show that there
exists a vector $v \in V$ satisfying $f(v, v) \ne 0$.


Exercise 1203

Let $V$ be a vector space finitely generated over a field $F$ of characteristic
other than 2. If $f \in \mathrm{Bil}(V \times V)$, show that $f(v, v) = 0$ for all $v \in V$ if and only
if $f(v, w) = -f(w, v)$ for all $v, w \in V$.


Exercise 1204 

Let $n$ be a positive integer, let $F$ be a field, and let $A \in M_{n \times n}(F)$. Show that
there exists a symmetric matrix $B \in M_{n \times n}(F)$ satisfying $v \cdot Av = v \cdot Bv$ for all
$v \in F^n$.


Exercise 1205 


Find a bilinear form $f \in \mathrm{Bil}(\mathbb{R}^3 \times \mathbb{R}^3)$ which defines the quadratic
form $\begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto a^2 - 2ab + 4ac - 2bc + 2c^2$.


Exercise 1206 
Let $f \in \mathrm{Bil}(\mathbb{R}^3, \mathbb{R}^3)$ be the symmetric bilinear form defined by the matrix
$\begin{bmatrix} -3 & 1 & 0 \\ 1 & -6 & 1 \\ 0 & 1 & -7 \end{bmatrix}$. Find the quadratic form defined by $f$.


Exercise 1207 
Let f € Bill(R?, IR?) be the symmetric bilinear form defined by the matrix 


2 -1 5 
-1 : ; . Find the quadratic form defined by f/f. 
5 g =3 
3 


Exercise 1208 

Find a symmetric bilinear form $f \in \mathrm{Bil}(\mathbb{R}^3, \mathbb{R}^3)$ which defines the quadratic
form $\begin{bmatrix} a \\ b \\ c \end{bmatrix} \mapsto 2ab + 4ac + 6bc$.
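
One standard way to pass from a quadratic form to a symmetric matrix over a field of characteristic other than 2 is to put each square coefficient on the diagonal and to split each mixed coefficient equally between the two off-diagonal positions. The Python sketch below illustrates this for the form $2ab + 4ac + 6bc$; it is one possible approach rather than the book's own solution, the names `symmetric_matrix` and `bilinear` are hypothetical, and the coordinates are indexed 0, 1, 2 for $a$, $b$, $c$.

```python
from fractions import Fraction


def symmetric_matrix(n, coeffs):
    """Build the symmetric n x n matrix of a quadratic form.

    coeffs maps index pairs (i, j) with i <= j to the coefficient
    of x_i * x_j in the quadratic form.
    """
    m = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), c in coeffs.items():
        if i == j:
            m[i][i] = Fraction(c)
        else:
            # Split the mixed coefficient between positions (i, j) and (j, i).
            m[i][j] = m[j][i] = Fraction(c, 2)
    return m


def bilinear(m, v, w):
    """Evaluate the bilinear form (v, w) -> v . (M w)."""
    n = len(m)
    return sum(v[i] * m[i][j] * w[j] for i in range(n) for j in range(n))


# 2ab + 4ac + 6bc, with a, b, c indexed as 0, 1, 2.
coeffs = {(0, 1): 2, (0, 2): 4, (1, 2): 6}
M = symmetric_matrix(3, coeffs)

v = (1, 2, 3)
q_v = 2 * v[0] * v[1] + 4 * v[0] * v[2] + 6 * v[1] * v[2]
print(M)
print(bilinear(M, v, v) == q_v)
```

The division by 2 is the reason the construction needs characteristic other than 2; over $\mathbb{R}$ it is harmless.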


Exercise 1209 

Let $F$ be a field of characteristic other than 2, and let $V$ be a vector space over $F$.
Let $q : V \to F$ be a function satisfying the condition that $q(v + w) + q(v - w) =
2q(v) + 2q(w)$ for all $v, w \in V$. Show that the function $f : V \times V \to F$ defined
by $f : (v, w) \mapsto \frac{1}{4}[q(v + w) - q(v - w)]$ is a symmetric bilinear form.


Exercise 1210

Let $V$ be a vector space over a field $F$ of characteristic other than 2, and let
$f \in \mathrm{Bil}(V \times V)$ be a symmetric bilinear form which defines a quadratic form
$q : V \to F$. Show that

$$q(u + v + w) = q(u + v) + q(u + w) + q(v + w) - q(u) - q(v) - q(w)$$

for all $u, v, w \in V$.
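
Both the polarization formula of Exercise 1209 and the three-vector identity of Exercise 1210 can be checked on a concrete quadratic form before proving them. The sketch below is an illustration only: the symmetric matrix `B` (the matrix of the sample form $a^2 - 2ab + 4ac - 2bc + 2c^2$ on $\mathbb{R}^3$), the test vectors, and the helper names are all arbitrary choices made here.

```python
from fractions import Fraction

# Symmetric matrix of the sample quadratic form a^2 - 2ab + 4ac - 2bc + 2c^2.
B = [[1, -1, 2],
     [-1, 0, -1],
     [2, -1, 2]]


def f(v, w):
    """Symmetric bilinear form (v, w) -> v . (B w)."""
    return sum(v[i] * B[i][j] * w[j] for i in range(3) for j in range(3))


def q(v):
    """Quadratic form defined by f."""
    return f(v, v)


def add(*vectors):
    """Componentwise sum of any number of vectors."""
    return tuple(sum(parts) for parts in zip(*vectors))


def sub(v, w):
    """Componentwise difference v - w."""
    return tuple(x - y for x, y in zip(v, w))


u, v, w = (1, 0, 2), (3, -1, 1), (0, 2, -2)

# Identity of Exercise 1210.
lhs = q(add(u, v, w))
rhs = q(add(u, v)) + q(add(u, w)) + q(add(v, w)) - q(u) - q(v) - q(w)
print(lhs == rhs)

# Polarization formula of Exercise 1209 recovers f from q.
polar = Fraction(1, 4) * (q(add(v, w)) - q(sub(v, w)))
print(polar == f(v, w))
```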


Exercise 1211 
Let $V$ be a vector space over a field $F$. Show that $V \cong F \otimes V$.


Exercise 1212 
Let $V$ and $W$ be vector spaces over a field $F$. Let $x \in V \otimes W$ be written in the
form $x = \sum_{i=1}^{n} v_i \otimes w_i$, where $n$ is minimal in the sense that there is no way to
express $x$ in the form $\sum_{i=1}^{k} v_i' \otimes w_i'$ for any $k < n$. Show that $\{v_1, \ldots, v_n\}$ is a
linearly-independent subset of $V$ and that $\{w_1, \ldots, w_n\}$ is a linearly-independent
subset of $W$.


Exercise 1213 
Let $K$ be a field containing $F$ as a subfield. If $V$ is a vector space over $F$, show
that $K \otimes V$ is a vector space over $K$.


Exercise 1214 

Let $V$ be a vector space of finite dimension $n$ over a field $F$ and let $Y$ be the
subspace of $V \otimes V$ generated by all elements of the form $v \otimes v' - v' \otimes v$, where
$v, v' \in V$. Find the dimension of $Y$.


Exercise 1215 

Let $V$ and $W$ be finite dimensional vector spaces over a field $F$. Let $v, v' \in V$
and $w, w' \in W$ be vectors satisfying the condition $v \otimes w = v' \otimes w'$ and this is not
the identity element of $V \otimes W$ with respect to addition. Show that there exists a
scalar $c \in F$ such that $v = cv'$ and $w' = cw$.


Exercise 1216 

Let $F$ be a field and, for all $A, B \in M_{2 \times 2}(F)$, denote the Kronecker product
of $A$ and $B$ by $A \otimes B$. If $\{H_1, \ldots, H_4\}$ is the canonical basis for $M_{2 \times 2}(F)$, is
$\{H_i \otimes H_j \mid 1 \le i, j \le 4\}$ a basis for $M_{4 \times 4}(F)$?


Exercise 1217 
Find the numerical range of the quadratic form g : R? — R defined by g: vb 
a ae v 
0 0} - 
Exercise 1218 


Let $n$ be a positive integer and let $F$ be a field. If $A \in M_{n \times n}(F)$ is a magic
matrix, is the same true for $A \otimes A \in M_{n^2 \times n^2}(F)$?


Exercise 1219

Let $F$ be a field and let $k$ and $n$ be positive integers. If matrices $A \in M_{k \times k}(F)$
and $B \in M_{n \times n}(F)$ have eigenvalues $a$ and $b$, respectively, show that $ab$ is an
eigenvalue of $A \otimes B$.


Exercise 1220 

Let $F$ be a field and let $k$ and $n$ be positive integers. If matrices $A \in M_{k \times k}(F)$
and $B \in M_{n \times n}(F)$ have eigenvalues $a$ and $b$ respectively, find a matrix
$C \in M_{kn \times kn}(F)$ with eigenvalue $a + b$.


Exercise 1221 

Let $F$ be a field of characteristic other than 2 and let $V$ be a vector space over $F$.
Find the minimal polynomial of the endomorphism $\alpha$ of $V \otimes V$ defined by
$\alpha : \sum_{i=1}^{n} a_i (v_i \otimes w_i) \mapsto \sum_{i=1}^{n} a_i (w_i \otimes v_i)$.


Exercise 1222 

Let $F$ be a field, let $k$, $n$, $s$, and $t$ be positive integers, and consider matrices
$A \in M_{k \times n}(F)$ and $B \in M_{s \times t}(F)$. Is the rank of $A \otimes B$ necessarily equal to the
product of the ranks of $A$ and $B$?


Exercise 1223 

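Let $V$, $V'$, $W$, $W'$ be vector spaces over a field $F$ and let $\alpha : V \to V'$ and
$\beta : W \to W'$ be monic linear transformations. Let $\alpha \otimes \beta$ be the linear
transformation from $V \otimes W$ to $V' \otimes W'$ defined by
$\alpha \otimes \beta : \sum_{i=1}^{n} a_i (v_i \otimes w_i) \mapsto \sum_{i=1}^{n} a_i [\alpha(v_i) \otimes \beta(w_i)]$. Is $\alpha \otimes \beta$ monic?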


Exercise 1224 

Let $F$ be a field and let $(K, \bullet)$ and $(L, *)$ be $F$-algebras. Define an operation $\odot$
on $K \otimes L$ by setting $(v \otimes w) \odot (v' \otimes w') = (v \bullet v') \otimes (w * w')$ for all $v, v' \in K$
and $w, w' \in L$. Is $(K \otimes L, \odot)$ an $F$-algebra?


Exercise 1225 


Let V = R? and let W = V ® V. If w € W is normal, do there necessarily exist 
normal vectors v, v’ € V such that w = v @ v’? 


Exercise 1226 
Let $V$ be a vector space over $\mathbb{R}$. Show that the complexification of $V$ is
isomorphic to $\mathbb{C} \otimes V$.


Exercise 1227 

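Let $V$ be an inner product space over $\mathbb{R}$ having a basis $\{v_i \mid i \in \Omega\}$ and let $W$
be an inner product space over $\mathbb{R}$ having a basis $\{w_j \mid j \in \Lambda\}$. Define a function
$\mu : (V \otimes W) \times (V \otimes W) \to \mathbb{R}$ by setting

$$\mu : \left( \sum_{i \in \Omega} \sum_{j \in \Lambda} a_{ij} (v_i \otimes w_j), \sum_{i \in \Omega} \sum_{j \in \Lambda} b_{ij} (v_i \otimes w_j) \right) \mapsto \sum_{i \in \Omega} \sum_{j \in \Lambda} a_{ij} b_{ij} [\langle v_i, v_i \rangle + \langle w_j, w_j \rangle].$$

Is $\mu$ an inner product on $V \otimes W$?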


Exercise 1228 

Let $V$ be a vector space over a field $F$ and let $\alpha \in \mathrm{End}(V)$. Is the function
$V \wedge V \to V \wedge V$ defined by $\sum_{i=1}^{n} c_i (v_i \wedge w_i) \mapsto \sum_{i=1}^{n} c_i (\alpha(v_i) \wedge \alpha(w_i))$ a
linear transformation?


Exercise 1229 
Let $n$ be a positive integer and let $A, B \in M_{n \times n}(\mathbb{R})$ be orthogonal matrices. Is
their Kronecker product $A \otimes B$ an orthogonal matrix?


Exercise 1230 
Let $n$ be a positive integer and let $A, B \in M_{n \times n}(\mathbb{R})$ be permutation matrices. Is
their Kronecker product $A \otimes B$ a permutation matrix?


Exercise 1231 
Let $k$ and $n$ be positive integers and let $F$ be a field. Let $A \in M_{k \times k}(F)$ and let
$B \in M_{n \times n}(F)$. Is it necessarily true that $\mathrm{tr}(A \otimes B) = \mathrm{tr}(A)\,\mathrm{tr}(B)$?
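
Several of the preceding exercises on Kronecker products (for example Exercises 1219 and 1231) can be explored on small examples before being settled in general. The sketch below builds the Kronecker product directly from its block description; the helper names `kron`, `kron_vec`, `trace`, and `matvec` are hypothetical, and the integer matrices and eigenvectors are arbitrary illustrative choices.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows.

    Block (i, j) of the result is a[i][j] * b.
    """
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def kron_vec(v, w):
    """Coordinates of the simple tensor v (x) w, ordered to match kron."""
    return [x * y for x in v for y in w]


def trace(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


A = [[2, 1], [0, 3]]    # A v = 2 v for v = (1, 0)
B = [[1, 2], [2, 1]]    # B w = 3 w for w = (1, 1)
v, w = [1, 0], [1, 1]

K = kron(A, B)
vw = kron_vec(v, w)

# Compare tr(A (x) B) with tr(A) * tr(B) on this example.
print(trace(K), trace(A) * trace(B))
# Compare (A (x) B)(v (x) w) with (2 * 3) (v (x) w).
print(matvec(K, vw), [6 * t for t in vw])
```

Agreement on one example is only evidence, not a proof; a counterexample, on the other hand, would settle a "necessarily true" question in the negative.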


Exercise 1232 
Let $k$ and $n$ be positive integers and let $F$ be a field. For $A \in M_{k \times k}(F)$ and
$B \in M_{n \times n}(F)$, find a matrix $C \in M_{kn \times kn}(F)$ such that $e^C = e^A \otimes e^B$.


Exercise 1233 


The matrix [4 × 4 matrix illegible in the source] plays an important part in quantum information theory.
Write this matrix as a sum of Kronecker products of $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and the three
Pauli matrices.
Opposite transformation, 453
Optimization algebra, 9 
Order of recurrence, 298 
Orderable field, 18 
Orthogonal, 369 
Orthogonal complement, 374 
right, 458 
Orthogonal matrix, 422 
Orthogonal projection, 374 
Orthogonality with respect to a bilinear form, 
458 
Orthogonally diagonalizable, 398 
Orthonormal, 375 


P 
Padé approximant, 239 
Pairwise disjoint, 27 
Parallelogram law, 339 
Parity, 456 
Parseval’s identity, 380 
Partial order, 60 
Loewner, 404 
Partial pivoting, 168 
Partially-ordered set, 60 
Pauli matrices, 82 
Periodic function, 69 
Permanent, 252 
Permutation, 2 
even, 224 
odd, 224 
Permutation matrix, 158 
Pfaffian, 228 
Piecewise constant, 34 
Pivot, 168 
Pivoting 
full, 168 
partial, 168 
Poincaré—Birkhoff—Witt Theorem, 42 
Polar decomposition, 431 
Polarization identity 
complex, 363 
real, 363 
Polynomial, 44 
characteristic, 264, 298 
Chebyshev, 370 
completely reducible, 49 


cyclotomic, 48 

flat, 51 

in several indeterminates, 50 

irreducible, 47 

Jacobi, 371 

Lagrange interpolation, 162 

Legendre, 370 

minimal, 273, 298 

monic, 44 

reciprocal, 440 

reducible, 47 

trigonometric, 56 

zero, 44 
Polynomial function, 47 
Positive definite, 403 
Positive quadratic form, 462 
Positive semidefinite, 403 
Power 

exterior, 470 
Pre-Banach space, 342 
Pre-Hilbert space, 333 
Primitive root of unity, 152 
Principal component analysis, 118 
Process 

Arnoldi, 375 

Gram–Schmidt, 372
Product 

bra-ket, 136 

Cartesian, 2 

cross, 42 

direct, 23 

dot, 334 

dyadic, 145 

exterior, 136 

Hadamard, 174 

inner, 333 

interior, 136 

Jordan, 43 

ket-bra, 136 

Kronecker, 467 

Lie, 42 

scalar triple, 339 

Schur, 174 

tensor, 465 

vector triple, 339 
Projection, 118 

onto an affine set, 387 

orthogonal, 374 
Proper subspace, 25 
Pseudoinverse 

Drazin, 450 

Moore-Penrose, 441 


Q 
QR algorithm, 380 
QR-decomposition, 379 
Quadratic form, 461 
positive, 462 
Quadratic surface, 464 
Quasidefinite matrix, 415 
Quaternion 
algebra, 66 
real, 66 
QZ algorithm, 380 


R 

Range, 2 
numerical, 463 

Rank, 98, 199, 457 


Rational Decomposition Theorem, 304 


Rational number, 5 
Rayleigh quotient 

function, 400 

iteration scheme, 400 
Real Euclidean, 333 
Real number, 5 
Real part, 7 
Real polarization identity, 363 
Real quaternion, 66 
Reciprocal polynomial, 440 


Reduced row echelon form, 194 


Reducible, 47 
Relation 
equivalence, 120 
partial order, 60 
Relaxation method, 207 
Representation, 153 
Restriction, 2 


Riesz Representation Theorem, 382 
Right orthogonal complement, 458 


Row echelon form, 193 
Row equivalent, 161 
Row space, 199 


S

Scalar, 22 

Scalar matrix, 148 

Scalar multiplication, 21 
Scalar triple product, 339 
Schur complement, 160 
Schur product, 174 
Schur’s Theorem, 421 
Selfadjoint, 395 
Semifield, 9 

Semisimple eigenvalue, 270
Sequence, 1
Fibonacci, 299 
linearly recurrent, 298 
Set 
difference, 2 
generating, 28 
partially-ordered, 60 
spanning, 28 
Sherman—Morrison—Woodbury Theorem, 154 
Signum, 224 
Similar matrices, 266 
Simple eigenvalue, 270 
Simple tensors, 465 
Simpson’s rule, 187 
Singular matrix, 151 
Singular value, 432 
Singular Value Decomposition Theorem, 431 
Skew symmetric bilinear form, 453 
Skew symmetric matrix, 150 
Solution set, 191 
Solution space, 191 
SOR, 207 
Space 
dual, 317 
inner product, 333 
normed, 342 
pre-Banach, 342 
pre-Hilbert, 333 
solution, 191 
Spanning set, 28 
Sparse matrix, 167 
Special Lie algebra, 320 
Special orthogonal matrix, 424 
Spectral condition number, 432 
Spectral Decomposition Theorem, 429 
Spectral norm, 344 
Spectral radius, 258 
Spectrum, 255 
Spline function, 56 
Stabilizer, 131 
Standard identity, 238 
Stationary iteration method, 208 
Steinitz Replacement Property, 60 
Stochastic matrix, 150 
Strassen—Winograd algorithm, 166 
Strictly diagonally dominant, 249 
Subalgebra, 41 
unital, 41 
Subfield, 6 
Euclidean, 333 
real Euclidean, 333 
Subset 
affine, 96 
bounded, 67 


chain, 61 
convex, 412 
Hilbert, 376 
orthonormal, 375 
underlying, 1


Subspace, 25 


cyclic, 117 
fuzzy, 37 
generated by, 28 
improper, 25 
invariant, 117 
Krylov, 297 
maximal, 325 
nontrivial, 25 
proper, 25 
spanned by, 28 
trivial, 25 


Subspaces 


disjoint, 27 
independent, 74 


pairwise disjoint, 27 
Successive overrelaxation method, 207 
Sylvester’s Theorem, 98 


Symmetric 


bilinear transformation, 453 


matrix, 150 


Toeplitz matrix, 202 
with respect to an involution, 385 
Symmetric difference, 24 


Symplectic matrix, 440 


System of linear equations, 189
homogeneous, 190
nonhomogeneous, 190


T
Taber’s Theorem, 321
Taylor coefficient, 117 
Tensor algebra, 470 
Tensor product, 465 
Tensors 


entangled, 465 
simple, 465 


Ternary ring, 18 
Trace, 318 
Transcendental, 73 
Transform 


discrete cosine, 152 


discrete Fourier, 152, 341 


fast Fourier, 152 


Transformation 


affine, 96 
bilinear, 453 
linear, 89 
opposite, 453
Transpose, 95 

conjugate, 334 

Hermitian, 334 
Triangle difference inequality, 339 
Triangle inequality, 353 
Triangular norm, 37 
Tridiagonal matrix, 149 
Trigonometric polynomial, 56 
Trivial subspace, 25 


U
Underlying subset, 1
Uniform combination, 38
Uniform hull, 38
Union, 1
Unit, 40
Unital algebra, 39
Unital subalgebra, 41
Unitarily similar matrices, 420
Unitary automorphism, 419
Unitary matrix, 419
Upper Hessenberg matrix, 473
Upper-triangular matrix, 149


V
Vandermonde matrix, 163 
Variety 
linear, 96 
Vector, 22 
normal, 338
Vector addition, 21
Vector space, 21 
finite dimensional, 71 
finitely generated, 29 
infinite dimensional, 71 
Vector triple product, 339 
Vectors 
orthogonal, 369 
orthogonal with respect to a bilinear form, 
458 


W
Wavelet 

Haar, 376 
Weak dual space, 323 
Weight function, 318 
Weight of a Baxter algebra, 90 
Weighted dot product, 335 
Well Ordering Principle, 61 
Weyl’s Problem, 399 
Width of a band matrix, 149 
Wiedemann algorithm, 301 
Word, 23 
Wronskian, 228 


Z 

Zero-sum combination, 38 
Zero-sum hull, 38 
Zlobec’s formula, 445 
Zorn’s Lemma, 67 


